Re: [BUG -next] WARNING: kernel/printk/printk_ringbuffer.c:1278 at get_data+0xb3/0x100

From: Petr Mladek

Date: Thu Nov 13 2025 - 02:37:22 EST


Hi Paul,

First, thanks a lot for reporting the regression.

On Wed 2025-11-12 16:52:16, Paul E. McKenney wrote:
> Hello!
>
> Some rcutorture runs on next-20251110 hit the following error on x86:
>
> WARNING: kernel/printk/printk_ringbuffer.c:1278 at get_data+0xb3/0x100, CPU#0: rcu_torture_sta/63
>
> This happens in about 20-25% of the rcutorture runs, and is the
> WARN_ON_ONCE(1) in the "else" clause of get_data(). There was no
> rcutorture scenario that failed to reproduce this bug, so I am guessing
> that the various .config files will not provide useful information.
> Please see the end of this email for a representative splat, which is
> usually rcutorture printing out something or another. (Which, in its
> defense, has worked just fine in the past.)
>
> Bisection converged on this commit:
>
> 67e1b0052f6b ("printk_ringbuffer: don't needlessly wrap data blocks around")
>
> Reverting this commit suppressed (or at least hugely reduced the
> probability of) the WARN_ON_ONCE().
>
> The SRCU-T, SRCU-U, and TREE09 scenarios hit this most frequently at
> about double the base rate, but are CONFIG_SMP=n builds. The RUDE01
> scenario was the most productive CONFIG_SMP=y scenario. Reproduce as
> follows, where "N" is the number of CPUs on your system divided by three,
> rounded down:
>
> tools/testing/selftests/rcutorture/bin/kvm.sh --allcpus --duration 5 --configs "N*RUDE01"
>
> Or if you can do CONFIG_SMP=n, the following works, where "N" is the
> number of CPUs on your system:
>
> tools/testing/selftests/rcutorture/bin/kvm.sh --allcpus --duration 5 --configs "N*SRCU-T"
>
> Or please tell me what debug I should enable on my runs.

The problem was reported by two test robots last week. It happens when
a message fits exactly up to the last byte before the ring buffer
wraps for the first time. It is interesting that you have seen it
so frequently (in about 20-25% of rcutorture runs).
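
For illustration only, here is a minimal userspace sketch of this kind of
corner case. It is not the kernel code and not the actual fix. It assumes
that logical positions are unsigned long values which start one buffer
length below zero, so that the first wrap coincides with an integer
overflow. All names (FIRST_LPOS, block_next(), naive_is_valid(),
wrap_safe_is_valid()) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical tiny data ring: 32 bytes. */
#define SIZE_BITS	5UL
#define DATA_SIZE	(1UL << SIZE_BITS)

/*
 * Assumption: the first logical position sits DATA_SIZE below zero,
 * so the very first wrap overflows the unsigned long to 0.
 */
#define FIRST_LPOS	(0UL - DATA_SIZE)

/* Offset of a logical position inside the data array. */
static unsigned long data_index(unsigned long lpos)
{
	return lpos & (DATA_SIZE - 1);
}

/* Logical position right after a block that is stored without wrapping. */
static unsigned long block_next(unsigned long begin, unsigned long size)
{
	/* The block must fit between begin and the end of the array. */
	assert(size <= DATA_SIZE - data_index(begin));
	return begin + size;	/* may overflow to 0 on an exact fit */
}

/* Naive check: fails when "next" has overflowed below "begin". */
static bool naive_is_valid(unsigned long begin, unsigned long next)
{
	return begin < next;
}

/* Overflow-safe check: unsigned subtraction yields the real size. */
static bool wrap_safe_is_valid(unsigned long begin, unsigned long next)
{
	unsigned long size = next - begin;

	return size > 0 && size <= DATA_SIZE;
}
```

With begin = FIRST_LPOS + 8 and a 24-byte block, block_next() returns 0.
The naive comparison then rejects a perfectly valid block, while the
subtraction-based check accepts it.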

Anyway, I pushed a fix on Monday. It is commit
cc3bad11de6e0d601 ("printk_ringbuffer: Fix check of
valid data size when blk_lpos overflows"), see
https://git.kernel.org/pub/scm/linux/kernel/git/printk/linux.git/commit/?h=for-6.19&id=cc3bad11de6e0d6012460487903e7167d3e73957

Thanks a lot for such an exhaustive report. And I am sorry that you
probably spent a lot of time on it.

Best Regards,
Petr