Re: [syzbot] [kernel?] upstream test error: KMSAN: uninit-value in irqentry_exit_to_kernel_mode_preempt

From: Alexander Potapenko

Date: Tue May 12 2026 - 12:24:51 EST

On Tue, May 12, 2026 at 1:15 PM Mark Rutland <mark.rutland@xxxxxxx> wrote:
>
> On Tue, May 12, 2026 at 11:33:47AM +0200, Alexander Potapenko wrote:
> > On Mon, May 11, 2026 at 2:25 PM Thomas Gleixner <tglx@xxxxxxxxxx> wrote:
> > > > irqentry_exit_to_kernel_mode_preempt() is now checking for both `regs`
> > > > and `state`.
> > > > Because there is a lot of non-instrumented code around, we fail to
> > > > initialize these variables.
> > > > Instead, irqentry_enter_from_kernel_mode() explicitly calls
> > > > kmsan_unpoison_entry_regs(regs) to take care of the registers, but not
> > > > the state.
> > > > We should probably call `kmsan_unpoison_memory(&state, sizeof(state))`
> > > > at the same place.
> > >
> > > Good luck with unpoisoning 'state'. 'state' is not memory to begin with.
> >
> > My bad. Normally, the compiler would happily move `state` to memory if it is
> > address-taken, and make sure that its shadow is tracked properly.
> > But not in this case, because irqentry_enter_from_kernel_mode() is inlined into
> > a `noinstr` function.
> >
> > <I omit the elaborate assembly analysis here, tipping my hat!>
> >
> > > Again. 'state' is a pure register value, which is handed to
> > > irqentry_exit() and irqentry_exit_to_kernel_mode_preempt().
> >
> > There is no notion of a 'pure register value' in C, and the compiler may make
> > arbitrary decisions about whether a particular value is stored on the stack or
> > in the registers.
> >
> > Luckily, KMSAN does not have to know about that, because it works on the LLVM IR
> > level and can track the state of a value regardless of where it is stored.
> > In particular, it normally works for function return values - unless `noinstr`
> > kicks in.
> >
> > >
> > > But KMSAN magically associates a memory access which it and then claims
> > > it belongs to a SKB which was allocated in the interrupted code.
> > >
> > > What Mark's change actually does is to make the register value 'state'
> > > observable in an instrumented function, while before that 'state' was
> > > always confined in the non instrumentable code.
> >
> > Agreed.
> >
> > > But as that 'state' argument of irqentry_exit_to_kernel_mode_preempt()
> > > is a pure register value, which could be even a constant supplied by the
> > > caller of irqentry_exit(), KMSAN has _ZERO_ business to fiddle with it.
> >
> > I disagree with a general implication that KMSAN has zero business to fiddle
> > with values passed in registers. But I agree we are doing a poor job trying
> > to pull the shadow for `state` out of thin air.
> >
> > > The compiler _cannot_ assume anything about the 'state' argument as
> > > that's handed in as value in RSI from a completely different compilation
> > > unit.
> >
> > Again, this only matters because we are calling an instrumented function from
> > a non-instrumented one, otherwise it's perfectly fine to call between
> > compilation units.
>
> For context, can you explain how this is expected to work across
> compilation units when the caller and callee *are* instrumented, when an
> argument is passed in a register?

Per the KMSAN ABI, the metadata for the function arguments is passed via
buffers in the so-called context state (see
include/linux/kmsan_types.h).
In userspace, these buffers are thread-local variables referenced
by inline loads and stores.
In the kernel, the compiler inserts a call to
__msan_get_context_state() at the beginning of every function, and
then the instrumentation code uses whatever that function returned.

Assume we have a function:

int sum(int a, int b) {
    int result = a + b;
    return result;
}

Its instrumented version looks roughly as follows (we'll omit origin
tracking for simplicity):

int sum(int a, int b) {
    struct kmsan_context_state *kcs = __msan_get_context_state();
    int s_a = ((int *)kcs->param_tls)[0]; // shadow of a
    int s_b = ((int *)kcs->param_tls)[1]; // shadow of b
    int result = a + b;
    int s_result = s_a | s_b; // poisoned if either operand is poisoned
    ((int *)kcs->retval_tls)[0] = s_result; // returned shadow
    return result;
}

Most certainly `a` and `b` will be passed using registers, but that
doesn't matter: their metadata is safe as long as the caller does:

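    ((int *)kcs->param_tls)[0] = s_a; // shadow of the first argument
    ((int *)kcs->param_tls)[1] = s_b; // shadow of the second argument
    result = sum(a, b);
    s_result = ((int *)kcs->retval_tls)[0]; // shadow of the return value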

All the metadata tracking is implemented using an LLVM IR pass, which
ensures that the shadow for each uninitialized value is tracked
regardless of where that value is stored - be that heap memory, stack
or registers:
- when a value is created or loaded from memory, the compiler inserts
SSA registers corresponding to that value's shadow;
- when a value is written to memory, a shadow store is inserted;
- when a value is used in an operation, the result of that operation
receives a shadow value depending on the operands and their shadow
values;
- when a value is passed to a function, the corresponding context
state stores and loads are emitted.
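
To make the argument and return value handling concrete, here is a
minimal userspace sketch of it. It is a model, not kernel code: the
struct, the model_* names, POISONED and shadow_of_sum() are all made up
for illustration, and the slots are plain ints where the real context
state uses byte buffers. Origins are omitted again.

```c
#include <assert.h>

#define POISONED (~0)	/* every bit of the value is uninitialized */

/* Hypothetical stand-in for struct kmsan_context_state. */
struct model_context_state {
	int param_tls[8];	/* shadow of the function arguments */
	int retval_tls[1];	/* shadow of the return value */
};

static struct model_context_state model_state;

/* Stand-in for __msan_get_context_state(). */
static struct model_context_state *model_get_context_state(void)
{
	return &model_state;
}

/* "Instrumented" callee: loads argument shadow, stores return shadow. */
static int sum(int a, int b)
{
	struct model_context_state *cs = model_get_context_state();
	int s_a = cs->param_tls[0];
	int s_b = cs->param_tls[1];
	int result = a + b;

	cs->retval_tls[0] = s_a | s_b;
	return result;
}

/* "Instrumented" caller: returns the shadow it observes for sum(a, b). */
static int shadow_of_sum(int a, int s_a, int b, int s_b)
{
	struct model_context_state *cs = model_get_context_state();
	int result, s_result;

	cs->param_tls[0] = s_a;
	cs->param_tls[1] = s_b;
	result = sum(a, b);
	s_result = cs->retval_tls[0];

	assert(result == a + b);	/* the value itself is unaffected */
	return s_result;
}
```

In this model the shadow travels entirely through the context state, so
it does not matter whether `a` and `b` themselves live in registers or
on the stack.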

Now, this works fine under the assumption that all code in the kernel
is instrumented with KMSAN.
However this is not true.

1. There are a few `KMSAN_SANITIZE := n` in the Makefiles: some
prevent infinite recursion caused by KMSAN calling instrumented code;
others disable instrumentation of the code that executes before KMSAN
is fully set up.
2. There are annotations to soft-disable KMSAN for a particular
function by adding a `__no_kmsan_checks` attribute. There will still
be instrumentation, but all the function outputs (stores and returns)
will be initialized, and no errors will be reported.
This is handy when the compiler cannot properly instrument the
underlying code. One example is tricky inline assembly; another is
stack-walking functions that deliberately read and write
uninitialized values.
3. There are cases in which KMSAN must be disabled for the whole
function (`__no_sanitize_memory`).
We disable instrumentation for `noinstr` functions. Additionally, on
x86 there is exactly one case where this is done to avoid infinite
recursion when storing the origins.

It's important to note that, due to how function attributes are
implemented in Clang, __no_kmsan_checks and __no_sanitize_memory will
be applied to a function only if that function still exists during the
KMSAN instrumentation pass. If it was inlined before that pass, the
compiler will apply the instrumentation rules for whichever function is
calling the inlined one.

Now, because of the described ABI, calls between instrumented
functions and/or functions marked `__no_kmsan_checks` are perfectly
fine because the function arguments are stored and loaded on both
ends.
What does not work well is calling non-instrumented functions from
instrumented functions, and vice versa.
In the former case, KMSAN will not see the callee's side effects
(return values or memory stores), in the latter case, the callee may
receive incorrect shadow values for the function parameters or memory
stores in the caller. Both will
We have a few tools to deal with such cases. It often helps to move the
border between the instrumented and non-instrumented code by applying
__no_kmsan_checks or __no_sanitize_memory until we get to a point
where there are no incorrect shadow arguments.
Another way to deal with uninitialized data coming from
non-instrumented code is kmsan_unpoison_memory(address, size).
Unfortunately, as Thomas pointed out, we can't use it for locals in
non-instrumented code.
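
The non-instrumented-caller case can be modelled in userspace as
follows. This is only a sketch under assumptions: all names are
hypothetical, and the way the stale shadow ends up in the argument slot
(an earlier instrumented call that passed a poisoned value) is one
plausible scenario, not an analysis of the actual report. The second
scenario mirrors the __memset() idea quoted further down in this
message.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for struct kmsan_context_state. */
struct model_context_state {
	int param_tls[8];
	int retval_tls[1];
};

static struct model_context_state model_state;
static int model_reports;	/* number of "uninit-value" reports */

/* Instrumented callee: checks the shadow of `state` before using it. */
static int instr_callee(int state)
{
	if (model_state.param_tls[0])
		model_reports++;	/* KMSAN would report here */
	return state ? 1 : 0;
}

/* Earlier instrumented code that passed a poisoned argument somewhere. */
static void instr_earlier_call(void)
{
	model_state.param_tls[0] = ~0;
}

/* Non-instrumented caller: the value is fine, but no shadow is stored. */
static void noinstr_caller(void)
{
	int state = 1;

	(void)instr_callee(state);
}

/* Returns the number of reports when the slot is left stale. */
static int reports_with_stale_state(void)
{
	model_reports = 0;
	memset(&model_state, 0, sizeof(model_state));
	instr_earlier_call();
	assert(model_state.param_tls[0] != 0);	/* stale shadow remains */
	noinstr_caller();
	return model_reports;
}

/* Returns the number of reports when the state is wiped before the call. */
static int reports_with_cleared_state(void)
{
	model_reports = 0;
	instr_earlier_call();
	memset(&model_state, 0, sizeof(model_state));
	noinstr_caller();
	return model_reports;
}
```

In the model, the callee reports a perfectly initialized `state` as
uninitialized purely because nobody overwrote the slot it reads its
shadow from.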

> Below you suggest that the caller might add "hints", but it's not clear
> specifically what this means.
>
> > > Something is wrong in KMSAN/compiler land or do you still believe that
> > > you just need to unpoison the non existing memory 'state'?
> >
> > When we call an instrumented function from a non-instrumented one, the compiler
> > is doomed to not understand that and to be unable to track the function
> > parameters properly. Exactly because `noinstr` implies no instrumentation
> > whatsoever, the compiler may not add any hints on the caller side that would
> > help the callee understand what's going on - even if KMSAN is able to see this
> > `noinstr` function (which is not always the case).
> >
> > So what we could do is to add annotations manually on either the caller side or
> > the callee side.
>
> Without some understanding of those "hints" you mention, I don't see how
> we can do that on the caller side.

In this case by "hints" I meant any kind of instrumentation that the
compiler could have inserted automatically to mark the data in the
instrumented functions as initialized.
There are no such "hints" right now (apart from letting KMSAN
instrument the function).
As for the manual annotations, we only have the mentioned
`__no_kmsan_checks`, `__no_sanitize_memory`, and
kmsan_unpoison_memory().

>
> > We can apply `__no_kmsan_checks` to irqentry_exit_to_kernel_mode_preempt(),
> > making all its inputs initialized. This is the easiest solution, it may
> > introduce false negatives, but we are on a very thin ice anyway, so perhaps
> > doing so is better than dealing with more false positives in the interrupt code.
> >
> > Another option for the callee would be applying `__always_inline`, so that
> > irqentry_exit_to_kernel_mode_preempt() also becomes non-instrumented.
> > Given that irqentry_exit_to_kernel_mode_after_preempt() is already
> > `__always_inline`, it might be the right thing to do.
>
> We can do that, but this really suggests that there's a fundamental
> inability to pass arguments between code which is noinstr and code which
> is instrumented with AddressSanitizer, and that's inevitably going to
> bite us in future.

This is true (assuming you mean MemorySanitizer), but:
- I don't think we can avoid that, given that there will always be
non-instrumented code;
- so far the number of annotations has been manageable.

>
> > On the caller side, we could do something creative with instrumentation_begin()
> > and instrumentation_end(). We've had a discussion about that exactly four years
> > ago: https://lore.kernel.org/all/20220426164315.625149-29-glider@xxxxxxxxxx/T/#u
> > , but came to a conclusion that a handful of annotations on the noinstr/instr
> > boundary may do a better job than a solution that doesn't cover all cases.
>
> That doesn't look general at all, so I am not keen on that.

Ack.

> > In particular, the case of irqentry_exit_to_kernel_mode_preempt() could have
> > been solved by `__memset(state, 0, sizeof(struct kmsan_context_state))` in
> > instrumentation_begin(). But it wouldn't solve more complex (yet rare, and
> > non-existing today) cases where two functions are called from an instrumented
> > region, and the first function somehow leaves the argument state poisoned.
>
> How exactly is kmsan_context_state used?

Hope the explanation above helps.

>
> If that's supposed to carry some global or current context, surely
> blatting that in entry code will affect the code that was interrupted?

It will mostly affect the bottom-level function called by the entry
code; after that, nested functions just pass their parameters as per
the KMSAN ABI.

> I see kmsan_get_context() has an in_task() check, but that can't help
> with nested exceptions, so this doesn't look right at all.

KMSAN context is per-task, and if !in_task(), it is per-CPU.

For the nested exceptions, we conservatively bail out (see
kmsan_in_runtime()) to avoid deadlocking.
This may cause false negatives.
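
A rough userspace model of that selection logic, for illustration only.
The model_* names and the single depth counter are assumptions; in
particular, treating "nested" as a depth greater than one is a
simplification of what kmsan_in_runtime() looks at.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the KMSAN context. */
struct model_ctx {
	bool in_runtime;
};

static struct model_ctx model_task_ctx;		/* per-task context */
static struct model_ctx model_percpu_ctx;	/* per-CPU context */
static int model_irq_depth;			/* 0 means task context */

static bool model_in_task(void)
{
	return model_irq_depth == 0;
}

/* Per-task context in task context, per-CPU context otherwise. */
static struct model_ctx *model_get_context(void)
{
	return model_in_task() ? &model_task_ctx : &model_percpu_ctx;
}

/* Bail out conservatively when nested, else consult the context flag. */
static bool model_in_runtime(void)
{
	if (model_irq_depth > 1)
		return true;
	return model_get_context()->in_runtime;
}

/* Helper: is the per-CPU context selected at this nesting depth? */
static bool uses_percpu_ctx_at_depth(int depth)
{
	assert(depth >= 0);
	model_irq_depth = depth;
	return model_get_context() == &model_percpu_ctx;
}

/* Helper: does the runtime bail out at this nesting depth? */
static bool bails_out_at_depth(int depth)
{
	assert(depth >= 0);
	model_irq_depth = depth;
	return model_in_runtime();
}
```

At depth two and above the model skips its checks altogether, which is
where the false negatives come from.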

> > Do you think it's worth revisiting the instrumentation_begin() approach, or
> > shall we go with one of the compiler attributes instead?

> I think we need a better understanding of this first.
>
> It looks to me that there are bigger problems here.
>

I'll be happy to discuss this further.

--
Alexander Potapenko
Software Engineer

Google Germany GmbH
Erika-Mann-Straße, 33
80636 München

Geschäftsführer: Paul Manicle, Liana Sebastian
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg