Re: [next-20180517][ppc] watchdog: CPU 88 self-detected hard LOCKUP @ update_cfs_group+0x30/0x150

From: Abdul Haleem
Date: Tue May 29 2018 - 09:11:20 EST


On Mon, 2018-05-21 at 16:50 +1000, Nicholas Piggin wrote:
> Ah, it's POWER8.
>
> I'm betting we have a bug with nohz timer offloading somewhere.
>
> I *think* we may have seen similar on P9 as well, but that may be
> related to problems with stop states.
>
> Can you reproduce it easily? I'm thinking maybe adding some
> tracepoints that track decrementer settings and interrupts, and
> nohz offload activity might show something up.

Yes, the problem reproduces consistently on our CI setup, and today
it triggered on 4.17.0-rc6 (mainline) too.

--
Regards,

Abdul Haleem
IBM Linux Technology Centre