Re: amd iommu: rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 0-.... } 8 jiffies s: 113 root: 0x1/.
From: Paul E. McKenney
Date: Fri Nov 28 2025 - 15:28:35 EST
Sorry to be slow, USA Turkey Day and all that...
On Wed, Nov 26, 2025 at 04:26:37PM +0100, Borislav Petkov wrote:
> On Wed, Nov 26, 2025 at 03:32:19PM +0100, Borislav Petkov wrote:
> > Hi,
> >
> > this is latest Linus + latest tip/master. Box is Zen3. CCing AMD IOMMU folks
> > because the backtrace points to it.
> >
> > Ideas?
> >
> > [ 12.946913] (journald)[506]: Found cgroup2 on /sys/fs/cgroup/, full unified hierarchy
> > [ 12.948083] (journald)[506]: Successfully forked off '(sd-mkuserns)' as PID 507.
> > [ 12.977579] rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 0-.... } 8 jiffies s: 113 root: 0x1/.
> > [ 12.983638] rcu: blocking rcu_node structures (internal RCU debug): l=1:0-15:0x1/.
This one of course is a stall on CPU 0. But you knew that already.
Also, it looks like you have CONFIG_RCU_EXP_CPU_STALL_TIMEOUT=20 or maybe
booted with rcupdate.rcu_exp_cpu_stall_timeout=20 on a system with HZ=250?
Or set rcu_exp_cpu_stall_timeout=20 via sysfs?
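In case it helps narrow down which of these applies, here is a minimal
Python sketch for reading the runtime value. It assumes the parameter is
exposed at /sys/module/rcupdate/parameters/rcu_exp_cpu_stall_timeout, and
the helper name is made up for illustration, so please check both against
your kernel:

```python
from pathlib import Path

# Assumed sysfs location of the rcupdate.rcu_exp_cpu_stall_timeout parameter.
PARAM_PATH = Path("/sys/module/rcupdate/parameters/rcu_exp_cpu_stall_timeout")


def read_exp_stall_timeout_ms(path=PARAM_PATH):
    """Return the expedited stall timeout in milliseconds, or None if unreadable."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        # File missing, not readable, or not an integer.
        return None
```

Calling read_exp_stall_timeout_ms() with no arguments reads the assumed
default path. A result of None means only that the file could not be read
or parsed.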
> And as suspected, booting in it again, it doesn't trigger anymore. But there's
> something new in dmesg which looks weird and makes me want to Cc Paul:
>
> [ 6.965526] rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: {
This is the beginning of the message.
> [ 6.971581] Key type fscrypt-provisioning registered
> [ 6.975191] PM: Image not found (code -6)
> [ 6.975631] } 8 jiffies s: 89 root: 0x0/.
And this is the end. This looks like the stall ended just as the
stall-warning message started printing.
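If you want to pick these out of a longer dmesg, something like the
following rough Python sketch might do. It stitches a split message back
together and reports what was listed between the braces. The regular
expressions are based only on the lines quoted in this thread, and the
function and field names are hypothetical:

```python
import re

START = "detected expedited stalls on CPUs/tasks: {"
# Leading dmesg timestamp such as "[ 12.946913] ".
TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s*")
# Remainder of the message: entries, closing brace, jiffies, sequence number.
TAIL = re.compile(r"^(?P<entries>[^}]*)\}\s*(?P<jiffies>\d+) jiffies s: (?P<seq>\d+)")


def parse_expedited_stalls(lines):
    """Return one dict per expedited stall report found in the dmesg lines."""
    reports = []
    pending = False  # True between the opening "{" and the closing "}"
    for raw in lines:
        text = TIMESTAMP.sub("", raw.strip())
        if START in text:
            text = text.split(START, 1)[1]
            pending = True
        if not pending:
            continue
        match = TAIL.match(text)
        if match is None:
            continue  # unrelated console output interleaved with the report
        reports.append({
            "entries": match.group("entries").split(),
            "jiffies": int(match.group("jiffies")),
            "seq": int(match.group("seq")),
        })
        pending = False
    return reports
```

A report whose "entries" list is empty corresponds to the { } case above,
where nothing was still blocking the grace period by the time the list
was printed.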
> and
>
> [ 12.549532] rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 2-.... } 8 jiffies s: 113 root: 0x1/.
> [ 12.550863] rcu: blocking rcu_node structures (internal RCU debug):
This is a stall on CPU 2.
>
> [ 12.817601] rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: {
> [ 12.819773] (sd-mkdcre[520]: Credential search path is: /etc/credstore:/run/credstore:/usr/local/lib/credstore:/usr/lib/credstore
> [ 12.827074] } 8 jiffies s: 129 root: 0x0/.
>
> [ 12.881508] rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: {
> [ 12.892854] (sd-mkdcr[522]: Credential search path is: /etc/credstore.encrypted:/run/credstore.encrypted:/usr/local/lib/credstore.encrypted:/usr/lib/credstore.encrypted
> [ 12.905244] } 8 jiffies s: 133 root: 0x0/.
>
> Paul, this looks weird.
>
> Why is that issuing empty lists between the { }?
Again, my guess is that the stall is ending just as the print starts.
It also looks like you have the expedited stall warning set to 20
milliseconds, which as far as I know is used only on constrained systems
such as smartphones. If you set this value on a typical large server,
you will get very large numbers of expedited RCU CPU stall warnings.
Oh, and if you are running with HZ=1000 and the expedited RCU CPU stall
warning set to 20 milliseconds (let alone 8 milliseconds!), then as far as I know,
you are a pioneer breaking new ground. ;-)
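For reference, the arithmetic behind the HZ guesses: the "8 jiffies" in
the messages is 32 milliseconds at HZ=250 but only 8 milliseconds at
HZ=1000. A small Python sketch of the conversion follows. The function
names are mine, and rounding up to whole jiffies is an assumption about
how the kernel converts milliseconds:

```python
def jiffies_to_ms(jiffies, hz):
    """Convert a jiffies count to milliseconds for a given HZ."""
    return jiffies * 1000 / hz


def ms_to_jiffies(ms, hz):
    """Convert milliseconds to jiffies, rounding up to a whole jiffy."""
    return (ms * hz + 999) // 1000


# Show what "8 jiffies" and a 20 ms timeout mean at common HZ settings.
for hz in (100, 250, 300, 1000):
    print(f"HZ={hz}: 8 jiffies = {jiffies_to_ms(8, hz):.1f} ms, "
          f"20 ms = {ms_to_jiffies(20, hz)} jiffies")
```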
Thanx, Paul
> Thx.
>
> --
> Regards/Gruss,
> Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette