Re: [RFC] arch hardlockup detector interfaces improvement

From: Don Zickus
Date: Fri May 19 2017 - 15:43:53 EST


On Sat, May 20, 2017 at 12:53:06AM +1000, Nicholas Piggin wrote:
> > I am curious to know what IBM thinks there. Currently the HARDLOCKUP
> > detector sits on top of perf. I get the impression, you are removing that
> > dependency. Is that a permanent thing or are you thinking of switching back
> > and forth depending on if SOFTLOCKUP is enabled or not?
>
> We want to get away from perf permanently.
>
> The PMU interrupts are not specially non-maskable from a hardware
> POV, everything gets masked when you turn off interrupts in hardware.
> powerpc arch code implements a software disable layer, and PMU
> interrupts are differentiated there by being allowed to run even
> under local_irq_disable();
>
> We have a few issues with using perf for it. We disable it by
> default because using it for that breaks another PMU feature.
>
> But PMU interupts are not special, so it would be possible to e.g.,
> take the timer interrupt before soft disable and have it touch the
> watchdog if it fires while under local_irq_disable(). That give
> exact same kind of pseudo-NMI as perf interrupts, without using PMU.
>
> Further, we now want to introduce a local_pmu_disable() type of
> interface that extends this soft disable layer to perf interrupts
> as well for some cases. Once we start doing that, more code will
> be exempt from the hardlockup watchdog, whereas a watchdog specific
> hook from the timer interrupt would still cover it.

Interesting. Thanks for that info.

Do you think you can split the patch in half for me? You are shuffling code
around and making subtle changes in the shuffled code. It is hard to double
check everything. Namely watchdog_nmi_reconfigure and watchdog_update_cpus
suddenly appear.

The first patch would be straight code movement, the second the real
changes. I almost don't care if you throw in the LOCKUP->SOFTLOCKUP changes
in the first patch either.

I am really interested in the subtle changes to make sure you covered the
various race conditions that we have uncovered over the years.

I also would like to see a sample of the watchdog_nmi_reconfigure() you had
in mind. Usually we hide all the nmi stuff underneath the watchdog
functions. But those are threaded, which is what you are trying to avoid,
so the placement of the function makes sense but looks odd.

Thanks!

Cheers,
Don