Re: [PATCH v2] soc: aspeed: lpc-snoop: Fix usercopy overflow in snoop_file_read
From: Andrew Jeffery
Date: Tue May 26 2026 - 23:53:27 EST
Hi Karthikeyan,
On Sat, 2026-05-23 at 17:35 +0000, Karthikeyan KS wrote:
> put_fifo_with_discard() violates kfifo's single-producer/single-consumer
> lock-free contract by calling both kfifo_skip() (consumer op: out++) and
> kfifo_put() (producer op: in++) from the IRQ handler context.
> kfifo_skip() increments 'out' without a memory barrier, while
> kfifo_put() increments 'in' with smp_wmb(). On ARM (weakly ordered), a
> concurrent reader on another CPU can observe the new 'in' but a stale
> 'out', causing (in - out) to exceed the buffer size.
>
> __kfifo_to_user() clamps the copy length to (in - out), but since that
> value is already corrupted by the race, the clamp is ineffective. The
> subsequent kfifo_copy_to_user() ring-buffer split produces a second
> chunk exceeding the 2048-byte kmalloc-2k slab object, triggering:
>
> Backtrace:
> [ 2.972611] usercopy: Kernel memory exposure attempt detected from SLUB object 'kmalloc-2k' (offset 0, size 2049)!
> [ 2.974191] ------------[ cut here ]------------
> [ 2.974677] kernel BUG at mm/usercopy.c:99!
> [ 2.975068] Internal error: Oops - BUG: 0 [#1] SMP ARM
> [ 2.975755] Modules linked in: post_injector(O)
> [ 2.976668] CPU: 0 PID: 1 Comm: init Tainted: G O 5.15.178 #4
> [ 2.977316] Hardware name: Generic DT based system
> [ 2.977848] PC is at usercopy_abort+0x80/0xa8
> [ 2.978931] LR is at vprintk_emit+0xf0/0x230
> [ 2.979296] pc : [<8095c3a0>] lr : [<8017cb2c>] psr: 60000153
> [ 2.979781] sp : 810e3dc8 ip : 810e3d38 fp : 810e3dec
> [ 2.980192] r10: 00000003 r9 : 811ea801 r8 : 811ea800
> [ 2.980616] r7 : 00000001 r6 : 00000801 r5 : 00000801 r4 : 00000000
> [ 2.981104] r3 : 00000000 r2 : 00000000 r1 : 00000000 r0 : 00000066
> [ 2.981712] Flags: nZCv IRQs on FIQs off Mode SVC_32 ISA ARM Segment none
> [ 2.982363] Control: 00c5387d Table: 81a64008 DAC: 00000051
> [ 2.982881] Register r0 information: non-paged memory
> [ 2.983662] Register r1 information: NULL pointer
> [ 2.984074] Register r2 information: NULL pointer
> [ 2.984474] Register r3 information: NULL pointer
> [ 2.984879] Register r4 information: NULL pointer
> [ 2.985279] Register r5 information: non-paged memory
> [ 2.985704] Register r6 information: non-paged memory
> [ 2.986118] Register r7 information: non-paged memory
> [ 2.986553] Register r8 information: slab kmalloc-2k start 811ea800 pointer offset 0 size 2048
> [ 2.987789] Register r9 information: slab kmalloc-2k start 811ea800 pointer offset 1 size 2048
> [ 2.988541] Register r10 information: non-paged memory
> [ 2.988971] Register r11 information: non-slab/vmalloc memory
> [ 2.989511] Register r12 information: non-slab/vmalloc memory
> [ 2.990064] Process init (pid: 1, stack limit = 0x041252af)
> [ 2.990631] Stack: (0x810e3dc8 to 0x810e4000)
> [ 2.991119] 3dc0: 80c2ceb4 80c29fd4 80c1f6d4 00000000 00000801 811ea801
> [ 2.992044] 3de0: 810e3e1c 810e3df0 802e3a94 8095c32c 00000801 801168b0 811ea800 811ea800
> [ 2.992707] 3e00: 00000801 00000001 811eb001 00000001 810e3e44 810e3e20 802e8774 802e3978
> [ 2.993336] 3e20: 810e3e44 810e3e30 00000801 00001000 7edc2593 811ea800 810e3e6c 810e3e48
> [ 2.993971] 3e40: 804dedf0 802e8604 811d584c 00001000 810e3e98 7edc1d94 00001000 810e2000
> [ 2.994734] 3e60: 810e3e94 810e3e70 804def50 804ded54 810e3e98 804d3f28 00000000 811d586c
> [ 2.995377] 3e80: 810e3e98 00000001 810e3ed4 810e3e98 80579a58 804def0c 810e3f34 810e3ea8
> [ 2.996024] 3ea0: 80282c74 80282744 00000000 37f2b80b ffffe000 819a2cc0 810e3f68 00000001
> [ 2.996684] 3ec0: 805799a8 00001000 810e3f64 810e3ed8 802ee040 805799b4 00000003 1cd5b000
> [ 2.997310] 3ee0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> [ 2.997987] 3f00: 801002c4 37f2b80b 810e2000 810e2000 810e3fb0 00000000 801002c4 80314ce0
> [ 2.998678] 3f20: 00000000 00000003 014fc580 00001000 7edc1d94 37f2b80b 810e2000 819a2cc0
> [ 2.999268] 3f40: 819a2cc0 00000000 00000000 801002c4 810e2000 00000003 810e3f94 810e3f68
> [ 3.000024] 3f60: 802ee740 802edf90 00000000 00000000 00000800 37f2b80b 00000003 014fc580
> [ 3.000688] 3f80: 00006f2c 00000003 810e3fa4 810e3f98 802ee7e4 802ee6dc 00000000 810e3fa8
> [ 3.001364] 3fa0: 80100080 802ee7d8 00000003 014fc580 00000003 7edc1d94 00001000 00000000
> [ 3.002389] 3fc0: 00000003 014fc580 00006f2c 00000003 7edc2f10 00000000 00000001 00000000
> [ 3.003042] 3fe0: 000000ac 7edc1d18 0002c174 0002c190 60000150 00000003 00000000 00000000
> [ 3.003773] Backtrace:
> [ 3.004236] [<8095c320>] (usercopy_abort) from [<802e3a94>] (__check_heap_object+0x128/0x150)
> [ 3.005187] [<802e396c>] (__check_heap_object) from [<802e8774>] (__check_object_size+0x17c/0x1e0)
> [ 3.005996] r8:00000001 r7:811eb001 r6:00000001 r5:00000801 r4:811ea800
> [ 3.006555] [<802e85f8>] (__check_object_size) from [<804dedf0>] (kfifo_copy_to_user+0xa8/0x1b8)
> [ 3.007290] r7:811ea800 r6:7edc2593 r5:00001000 r4:00000801
> [ 3.007766] [<804ded48>] (kfifo_copy_to_user) from [<804def50>] (__kfifo_to_user+0x50/0x70)
> [ 3.008418] r9:810e2000 r8:00001000 r7:7edc1d94 r6:810e3e98 r5:00001000 r4:811d584c
> [ 3.009031] [<804def00>] (__kfifo_to_user) from [<80579a58>] (snoop_file_read+0xb0/0xf8)
> [ 3.009683] r6:00000001 r5:810e3e98 r4:811d586c
> [ 3.010055] [<805799a8>] (snoop_file_read) from [<802ee040>] (vfs_read+0xbc/0x328)
> [ 3.010686] r8:00001000 r7:805799a8 r6:00000001 r5:810e3f68 r4:819a2cc0
> [ 3.011182] [<802edf84>] (vfs_read) from [<802ee740>] (ksys_read+0x70/0xfc)
> [ 3.011762] r10:00000003 r9:810e2000 r8:801002c4 r7:00000000 r6:00000000 r5:819a2cc0
> [ 3.012451] r4:819a2cc0
> [ 3.012677] [<802ee6d0>] (ksys_read) from [<802ee7e4>] (sys_read+0x18/0x1c)
> [ 3.013261] r7:00000003 r6:00006f2c r5:014fc580 r4:00000003
> [ 3.013724] [<802ee7cc>] (sys_read) from [<80100080>] (ret_fast_syscall+0x0/0x48)
> [ 3.014330] Exception stack(0x810e3fa8 to 0x810e3ff0)
> [ 3.014767] 3fa0: 00000003 014fc580 00000003 7edc1d94 00001000 00000000
> [ 3.015379] 3fc0: 00000003 014fc580 00006f2c 00000003 7edc2f10 00000000 00000001 00000000
> [ 3.016030] 3fe0: 000000ac 7edc1d18 0002c174 0002c190
> [ 3.016641] Code: e1cd40fc e58dc000 e59f0024 ebfff741 (e7f001f2)
> [ 3.017733] ---[ end trace d9f7e472f48076c9 ]---
> [ 3.018380] Kernel panic - not syncing: Fatal exception
> [ 3.019190] ---[ end Kernel panic - not syncing: Fatal exception ]---
Can you please trim this a bit? See the discussion of backtraces here:
https://docs.kernel.org/process/submitting-patches.html#explanation-body
>
>
> This is reproducible on AST2500 BMC (dual-core ARM Cortex-A7) when
The AST2500 has a (single-core) ARM1176JZS, not a dual-core Cortex-A7.
> BIOS floods POST codes while userspace concurrently reads
> /dev/aspeed-lpc-snoop0.
>
> Fix by:
>
> 1. Clamping 'count' to SNOOP_FIFO_SIZE in snoop_file_read() so that
> copy_to_user() can never exceed the slab object boundary regardless
> of kfifo pointer state.
I don't think we shouldn't be double-accounting for the bug. Let's just
get the locking right given the observation that we have multiple
consumers.
>
> 2. Protecting the kfifo_skip() + kfifo_put() sequence in
> put_fifo_with_discard() and the kfifo_to_user() call in
> snoop_file_read() with a spinlock, ensuring pointer updates are
> atomic with respect to concurrent access and restoring the
> single-writer invariant for each pointer.
This appears sensible.
>
> put_fifo_with_discard() is only called from hardirq context where
> interrupts are already disabled, so plain spin_lock/spin_unlock is
> used. snoop_file_read() runs in process context and must use
> spin_lock_irqsave to prevent deadlock with the IRQ handler.
We only need _irqsave for generic code that runs in either context, but
the fops callbacks are never run in irq context, so we don't have that
ambiguity. You could make other choices here, however _irqsave isn't
wrong.
>
> Signed-off-by: Karthikeyan KS <karthiproffesional@xxxxxxxxx>
Can you please add a Fixes: tag?
> ---
> Andrew,
>
> You're right — __kfifo_to_user() clamps to (in - out). The issue is
> that (in - out) itself is corrupted by a race in put_fifo_with_discard().
>
> The function calls kfifo_skip() (out++) then kfifo_put() (in++) without
> synchronization. On SMP ARM, the reader observes fresh 'in' (barrier in
> kfifo_put) but stale 'out' (no barrier in kfifo_skip), making (in - out)
> exceed the buffer size. The clamp then passes through the corrupted value.
>
> Reproduced on QEMU ast2500-evb by injecting POST codes and simulating
> the race outcome.
>
Can you provide more details? The 2500 is single-core, so is either
executing in interrupt context or not, and the values should be
consistent after completing the interrupt.
> Also observed intermittently on physical dual-core
> AST2600 under heavy POST code traffic.
The AST2600 has a dual-core Cortex-A7, so your bug makes more sense to
me there...
Andrew