Re: RCU CPU stall Warnings with JTAG debugging

From: Paul E. McKenney
Date: Wed May 16 2018 - 10:50:49 EST


On Wed, May 16, 2018 at 01:45:45PM +0000, Ehson Hussain wrote:
> Hello Paul,
>
> Hope you're doing great. I'm trying JTAG debugging on an aarch64 SoC (a Xilinx ZynqMP ZCU102, to be precise). When I halt the cores manually or hit a breakpoint, I get an RCU CPU stall warning on resume if the halt lasted longer than the RCU grace period.
>
> If I resume quickly enough, I do not get the warning.
>
> My question is: if all cores are stopped during debugging, how does the RCU code detect (or self-detect) stalls on a CPU? I have never observed this behavior on older kernels (3.14 and later) with ARM cores, e.g. on ARMv7-based Freescale i.MX6 boards.

My guess is that the jiffies counter is updated after resume to account
for the time that elapsed during the halt. So the system resumes, the
jiffies counter gets a huge increment, and RCU's stall detection, which
compares jiffies against a deadline set when the grace period started,
complains about the massive passage of time.
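
To make that guess concrete, here is a small user-space model of the
idea. It is a sketch, not the kernel's code: every model_* name is
hypothetical, the tick rate and timeout are illustrative values, and it
assumes the clock code credits the halted wall-clock time to the tick
counter on resume.

```c
#include <stdbool.h>

/* Illustrative values only; real kernels configure these differently. */
#define MODEL_HZ                 250UL
#define MODEL_STALL_TIMEOUT_SECS 21UL

/* Wraparound-safe "a is later than b", the same idea as time_after(). */
static bool model_time_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/* Hypothetical stand-in for the state RCU keeps per grace period. */
struct model_grace_period {
	unsigned long start;          /* tick count when the GP began */
	unsigned long stall_deadline; /* ticks after which a stall is reported */
};

/* Record the start of a grace period and compute its stall deadline. */
static void model_gp_start(struct model_grace_period *gp, unsigned long now)
{
	gp->start = now;
	gp->stall_deadline = now + MODEL_STALL_TIMEOUT_SECS * MODEL_HZ;
}

/* The check itself: has the tick counter passed the deadline? */
static bool model_stall_detected(const struct model_grace_period *gp,
				 unsigned long now)
{
	return model_time_after(now, gp->stall_deadline);
}

/*
 * Tick count seen after a debugger halt, ASSUMING the halted time is
 * added to the counter in one jump when the cores resume.
 */
static unsigned long model_resume(unsigned long ticks_at_halt,
				  unsigned long halted_secs)
{
	return ticks_at_halt + halted_secs * MODEL_HZ;
}
```

In this model a halt of a few seconds leaves the counter short of the
deadline, while a halt longer than the timeout pushes it past the
deadline in a single step. That would match what you see: no warning if
you resume quickly enough, and a warning otherwise.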

If you are doing JTAG debugging, I recommend shutting off RCU CPU stall
warnings, for example, by booting with the
rcupdate.rcu_cpu_stall_suppress=1 kernel boot parameter.
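
As a sketch of what that looks like, the setting is appended to the
kernel command line. How the command line gets set depends on your
bootloader (for example, the bootargs variable in U-Boot), which I am
assuming here rather than taking from your setup:

```
rcupdate.rcu_cpu_stall_suppress=1
```

I believe the same knob is also exposed at runtime as
/sys/module/rcupdate/parameters/rcu_cpu_stall_suppress, so it could be
toggled around a debug session without rebooting, but please check that
on your kernel version.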

I suppose you could also tweak the clock code to pretend that no time
passes while halted, but that could be a very tricky task.

Thanx, Paul