Re: [syzbot] [kernel?] upstream test error: KMSAN: uninit-value in irqentry_exit_to_kernel_mode_preempt

From: Thomas Gleixner

Date: Thu Jun 25 2026 - 16:54:22 EST

Alexander!

First of all, sorry that I ignored this for almost a month. I was
buried in other stuff, including private things which turned out to be
more important.

On Wed, May 27 2026 at 14:39, Alexander Potapenko wrote:
> On Wed, May 13, 2026 at 2:36 AM Thomas Gleixner <tglx@xxxxxxxxxx> wrote:
>> IOW, the producer of a value argument has to ensure that the value
>> argument is properly initialized.
>>
>> This whole 'is initialized' propagation for by value arguments is
>> overengineered voodoo in my opinion. Why?
>
> Sorry, I left out a big chunk of how KMSAN works.
> In fact, the above cases are covered by
> `-fsanitize-memory-param-retval`, which eagerly checks parameters and
> return values having the `noundef` attribute.
> For such cases, the compiler will not propagate the shadow into TLS.

Ah. That makes a lot more sense.

> Yet we still have to maintain TLS parameter passing for types that
> lack the `noundef` attribute (e.g. structs with internal padding or
> vector types) and for variadic function arguments.

Ok.
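
For readers following along, a minimal sketch of the "struct with internal
padding" case mentioned above. The struct and helper names are made up for
illustration and are not kernel code. The idea is that the padding bytes
belong to no member, so initializing every member still leaves part of the
object undefined, which is presumably why such a by-value argument cannot
simply be marked `noundef`:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example type: on common ABIs the compiler inserts
 * padding after 'tag' so that 'value' is naturally aligned. */
struct toy_padded {
	char tag;
	long value;
};

/* Bytes of the object that are not covered by any member. */
static size_t toy_padding_bytes(void)
{
	return sizeof(struct toy_padded) - (sizeof(char) + sizeof(long));
}

/* Offset at which 'value' starts; anything between the end of 'tag'
 * and this offset is padding. */
static size_t toy_value_offset(void)
{
	return offsetof(struct toy_padded, value);
}
```

The exact amount of padding depends on the ABI, so the sketch only relies on
it being non-zero.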

>> So it just adds overhead for nothing and makes everything look "sane"
>> while in fact it provides no value and creates exactly the problems we
>> are debating right now.
>>
>> No?
>
> I hope the above addresses this concern to some extent.

It does.

> While eager checks simplify the instrumentation code, they (as
> mentioned) do not completely remove the need for TLS parameter
> passing.
> Long story short, it's unsafe to assume at the beginning of an
> instrumented function that all its by-value arguments are initialized.
> If that were true, we would indeed have no problems with
> non-instrumented functions calling instrumented ones.

Right, except for the already covered pointers.
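
To make the failure mode concrete, here is a userspace toy model of shadow
passing through a per-thread slot. It is only a sketch: every name in it is
hypothetical, and the real KMSAN context state and instrumentation are more
involved. It shows an instrumented callee trusting whatever shadow the last
instrumented caller left behind, so a call from a non-instrumented caller
with a perfectly valid value can still be reported:

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for the per-task parameter shadow. Non-zero shadow
 * means "uninitialized". Not the real KMSAN layout. */
static _Thread_local uint64_t toy_param_shadow[4];
static int toy_reports;

/* "Instrumented" callee: reads the shadow of its first argument from
 * the thread-local slot and reports if it is non-zero. */
static int toy_callee(int value)
{
	if (toy_param_shadow[0] != 0)
		toy_reports++;
	return value + 1;
}

/* "Instrumented" caller: publishes the argument's shadow before the call. */
static int toy_instrumented_caller(int value, uint64_t shadow)
{
	toy_param_shadow[0] = shadow;
	return toy_callee(value);
}

/* "Non-instrumented" caller: never touches the shadow slot, so the
 * callee sees whatever an earlier call left there. */
static int toy_noinstr_caller(int value)
{
	return toy_callee(value);
}
```

In this model, calling toy_instrumented_caller() with a non-zero shadow and
then toy_noinstr_caller() with an initialized value yields two reports, the
second one being the false positive.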

>> There were discussions about utilizing instrumentation_begin()/end() to
>> tell the compiler that this instrumentation_begin() lifts the 'do not
>> sanitize directive' and it could rightfully inject instrumentation code
>> there including calls, but obviously nothing happened.
>
> Right. At that point, we thought __no_kmsan_checks should be enough to
> keep the complexity manageable.
> The more I think about it now, the more I am convinced it shouldn't be
> hard to implement intrinsics that allow instrumentation between
> instrumentation_begin()/instrumentation_end() and treat all values
> passed into that region as initialized.
> This should fix problems with noinstr functions calling instrumented
> functions.

Ok.

>> According to the disassembly do_error_trap() just goes and invokes
>> notify_die() without any checks and notify_die() takes the same
>> (reordered) arguments uninspected and stores them into an on-stack data
>> struct which is properly poisoned first and then unpoisoned for each
>> member stored.
>
> Let's look at the compilation result with `-S -emit-llvm` - it is
> generally easier to comprehend and gives better understanding of
> what's going on with the instrumentation:
>
> ; Function Attrs: fn_ret_thunk_extern noredzone nounwind
> null_pointer_is_valid sanitize_memory
> define internal fastcc void @do_error_trap(ptr noundef %regs, i64
> noundef %error_code, ptr noundef %str, i64 noundef range(i64 0, 13)
> %trapnr, i32 noundef range(i32 4, 12) %signr, i32 noundef range(i32 0,
> 3) %sicode, ptr noundef %addr) unnamed_addr #9 align 16 !dbg !10395 {
> entry:
> %0 = call ptr @__msan_get_context_state() #19, !dbg !11994
> %retval_shadow = getelementptr i8, ptr %0, i64 800, !dbg !11994
> %param_origin = getelementptr i8, ptr %0, i64 3208, !dbg !11994
> %retval_origin = getelementptr i8, ptr %0, i64 4008, !dbg !11994
> #dbg_value(ptr %regs, !10394, !DIExpression(), !11995)
> #dbg_value(i64 %error_code, !10399, !DIExpression(), !11995)
> #dbg_value(ptr %str, !10400, !DIExpression(), !11995)
> #dbg_value(i64 %trapnr, !10401, !DIExpression(), !11995)
> #dbg_value(i32 %signr, !10402, !DIExpression(), !11995)
> #dbg_value(i32 %sicode, !10403, !DIExpression(), !11995)
> #dbg_value(ptr %addr, !10404, !DIExpression(), !11995)
> call void @__sanitizer_cov_trace_pc() #20, !dbg !11994
> %conv = trunc nuw nsw i64 %trapnr to i32, !dbg !11996
> store i32 0, ptr %retval_shadow, align 8, !dbg !11997
> %call = call i32 @notify_die(i32 noundef 8, ptr noundef %str, ptr
> noundef %regs, i64 noundef %error_code, i32 noundef %conv, i32 noundef
> %signr) #21, !dbg !11997
> ...
>
> There are no checks for `@notify_die` arguments because
> `@do_error_trap` receives them as `noundef`.
> If eager checks are enabled, the KMSAN pass knows that these arguments
> were already checked previously (see
> https://www.llvm.org/doxygen/MemorySanitizer_8cpp_source.html#l02149),
> so it omits the checks.

Interesting.

> So, if a `noundef` value is passed along a chain of function calls, it
> is checked only once, at the point where we don't yet know that it is
> `noundef`.
> As mentioned, for `noundef` arguments, no shadow is stored in the TLS.
> Thus, even if non-instrumented functions are along the way, their
> callees still assume that the `noundef` argument is initialized (which
> may introduce false negatives, but these are inevitable when
> non-instrumented code is present).
> To achieve the same behavior for the function argument with TLS shadow
> propagation, we'll indeed need to enable instrumentation within the
> instrumentation_begin()/instrumentation_end() region.

Right.

> I think doing so will resolve the issues with instrumented functions
> being called from non-instrumented functions in the same KMSAN
> context, and will not require the call_instr() magic you are
> suggesting.
> So let me finally take a stab at it.

Thank you!

> Note that the issue with incorrect context tracking in [soft]irqs is
> orthogonal to this and will need to be addressed separately.

Ok.

Thanks,

tglx