Re: [PATCH 1/2] perf/x86/intel: enable CPU ref_cycles for GP counter
From: Peter Zijlstra
Date: Tue May 30 2017 - 12:29:30 EST

On Tue, May 30, 2017 at 06:51:28AM -0700, Andi Kleen wrote:
> On Tue, May 30, 2017 at 11:25:23AM +0200, Peter Zijlstra wrote:
> > On Sun, May 28, 2017 at 01:31:09PM -0700, Stephane Eranian wrote:
> > > Ultimately, I would like to see the watchdog move out of the PMU. That
> > > is the only sensible solution.
> > > You just need a resource able to interrupt on NMI or you handle
> > > interrupt masking in software as has been proposed on LKML.
> >
> > So even if we do the soft masking, we still need to deal with regions
> > where the interrupts are disabled. Once an interrupt hits the soft mask
> > we still hardware mask.
> >
> > So to get full and reliable coverage we still need an NMI source.
>
> You would only need a single one per system however, not one per CPU.
> RCU already tracks all the CPUs, all we need is a single NMI watchdog
> that makes sure RCU itself does not get stuck.
>
> So we just have to find a single watchdog somewhere that can trigger
> NMI.

But then you have to IPI broadcast the NMI, which is less than ideal.
RCU doesn't have that problem because the quiescent state is a global
thing. CPU progress, which is what the NMI watchdog tests, is very much
per logical CPU though.
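
A rough userspace sketch of that distinction, not kernel code: every name
here (NR_CPUS, struct cpu_wd, wd_tick, wd_check, wd_check_all) is
hypothetical. It assumes each CPU bumps its own counter from a timer
interrupt and that the NMI-side check flags a CPU whose counter has not
moved since the previous check. With one NMI source per CPU, only
wd_check() runs, locally. With a single system-wide source, something has
to visit every CPU; the loop in wd_check_all() stands in for the IPI
broadcast.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 4	/* hypothetical CPU count for this sketch */

/* Per-CPU state: a counter bumped by that CPU's timer interrupt, and
 * the value the watchdog saw on its previous check. */
struct cpu_wd {
	uint64_t ticks;
	uint64_t ticks_saved;
};

static struct cpu_wd wd[NR_CPUS];

/* Stands in for the per-CPU timer interrupt: shows that this CPU is
 * still taking interrupts. */
static void wd_tick(unsigned int cpu)
{
	assert(cpu < NR_CPUS);
	wd[cpu].ticks++;
}

/* Stands in for the NMI-side check on one CPU. Returns 1 if the CPU
 * has not ticked since the previous check, 0 otherwise. */
static int wd_check(unsigned int cpu)
{
	assert(cpu < NR_CPUS);
	if (wd[cpu].ticks == wd[cpu].ticks_saved)
		return 1;
	wd[cpu].ticks_saved = wd[cpu].ticks;
	return 0;
}

/* Single NMI source: the check has to reach every CPU. This loop is a
 * placeholder for the broadcast. Returns a bitmask of stuck CPUs. */
static unsigned int wd_check_all(void)
{
	unsigned int cpu, stuck = 0;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		if (wd_check(cpu))
			stuck |= 1u << cpu;
	return stuck;
}
```

If all four CPUs tick and wd_check_all() runs, nothing is flagged; if CPU 2
then stops ticking while the others continue, the next wd_check_all()
reports only bit 2. The cost being objected to is the fan-out itself: the
single-source scheme pays for a visit to every CPU on every check, whereas
a per-CPU NMI source only ever runs the local check.
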
> > I agree that it would be lovely to free up the one counter though.
>
> One option is to use the TCO watchdog in the chipset instead.
> Unfortunately it's not a universal solution because some BIOSes lock
> the TCO watchdog for their own use. But if you have a BIOS that
> doesn't do that it should work.

I suppose you could also route the HPET to the NMI vector and other
similar things. Still, you're then stuck with IPI broadcasts, which
suck.

> > One other approach is running the watchdog off of _any_ PMI, then all we
> > need to ensure is that PMIs happen semi regularly. There are two cases
> > where this becomes 'interesting':
>
> Seems fairly complex.

Yes.. :/