Re: [BUG] KASAN: slab-out-of-bounds in vsnprintf triggered by large stack frame

From: Kees Cook
Date: Wed Jul 02 2025 - 10:49:30 EST


On Wed, Jul 02, 2025 at 02:00:55PM +0200, Petr Mladek wrote:
> Adding Kees and the linux-hardening mailing list to CC just to be sure.
>
> But I think that this is a bogus report, see below.
>
> On Tue 2025-07-01 22:11:55, Shardul Bankar wrote:
> > Hello,
> >
> > I would like to report a slab-out-of-bounds bug that can be reliably
> > reproduced with a purpose-built kernel module. This report was
> > initially sent to security@xxxxxxxxxx, and I was advised to move it to
> > the public lists.
> >
> > I have confirmed this issue still exists on the latest mainline kernel
> > (v6.16.0-rc4).
> >
> > Bug Summary:
> >
> > The bug is a KASAN-reported slab-out-of-bounds write within vsnprintf.
> > It appears to be caused by a latent memory corruption issue, likely
> > related to the names_cache slab.
> >
> > The vulnerability can be triggered by loading a kernel module that
> > allocates an unusually large stack frame. When compiling the PoC
> > module, GCC explicitly warns about this: warning: the frame size of
> > 29760 bytes is larger than 2048 bytes. This "stack grooming" positions
> > the task's stack to overlap with a stale pointer from a freed
> > names_cache object. A subsequent call to pr_info() then uses this
> > corrupted value, leading to the out-of-bounds write.
>
> Honestly, I think that everything works as expected.
> I do not see any bug with the existing kernel code.
> IMHO, the bug is in the test module, see below.
>
> > Reproducer:
> >
> > The following minimal kernel module reliably reproduces the crash on my
> > x86-64 test system.
> >
> > #include <linux/init.h>
> > #include <linux/module.h>
> > #include <linux/printk.h>
> >
> > #define STACK_FOOTPRINT (3677 * sizeof(void *))
> >
> > static int __init final_poc_init(void)
> > {
> > volatile char stack_eater[STACK_FOOTPRINT];
> > stack_eater[0] = 'A'; // Prevent optimization
>
> This takes the whole stack.

Way more than the whole stack. :) That's 29416 bytes and the default
stack is 8192 on x86_64. (Well, here it's actually 16K due to KASAN,
I think.) So this is well past the bottom of the stack. And since the
kernel builds with -fno-stack-clash-protection, we don't see a stack
probing crash as the stack usage crosses into the guard page. This is
the same as just doing:

static int __init final_poc_init(void)
{
	volatile char stack_eater;
	/* The stack grows down, so element [0] of the big array sits this far below. */
	*(&stack_eater - STACK_FOOTPRINT) = 'A';
	...

Try this and see how the crash changes:

static int __init final_poc_init(void)
{
	volatile char stack_eater[STACK_FOOTPRINT];
	/* Touch every byte, walking down from the top of the array. */
	for (int i = STACK_FOOTPRINT - 1; i >= 0; i--)
		stack_eater[i] = 'A';
	...

:)

> > pr_info("Final PoC: Triggering bug with controlled stack
> > layout.\n");
>
> And any function called here that needs to store a return
> address on the stack would fail.
>
> The compiler warned about it.
> KASAN caught and reported the problem.
>
> The solution is to listen to the compiler warnings and
> not create broken modules.

I would agree.

> > [ 214.242355] Call Trace:
> > [ 214.242356] <TASK>
> > [ 214.242359] ? console_emit_next_record+0x12b/0x450
> [...]
> > [ 214.242573] ? entry_SYSCALL_64_after_hwframe+0x76/0x7e
> > [ 214.242575] </TASK>

I would also note that the _entire_ trace is bogus too -- every line has
a leading "?", which means the unwinder is just guessing based on what was
left over in memory rather than producing a sane dump.

> > This is my first time reporting a bug on the mailing list, so please
> > let me know if any additional information or formatting is required.

I'd repeat what Petr said, which is: if the compiler is emitting
warnings, then it's likely the bug is not with the core kernel. :)

-Kees

--
Kees Cook