Re: x86's nmi_hz wrt. oprofile's nmi_timer_int.c
From: Ingo Molnar
Date: Tue Feb 03 2009 - 07:28:08 EST
* David Miller <davem@xxxxxxxxxxxxx> wrote:
> From: David Miller <davem@xxxxxxxxxxxxx>
> Date: Fri, 30 Jan 2009 13:54:09 -0800 (PST)
>
> > From: Ingo Molnar <mingo@xxxxxxx>
> > Date: Fri, 30 Jan 2009 16:01:25 +0100
> >
> > >
> > > * David Miller <davem@xxxxxxxxxxxxx> wrote:
> > >
> > > Reducing it to 1 HZ was kind of a performance hack: running NMIs at HZ
> > > needlessly interrupts the CPU HZ times a second. It's more than enough to
> > > have 1 nmi-watchdog tick per second to notice deadlocks that take longer
> > > than 5 seconds.
> >
> > For the NMI watchdog's purposes I understand the intent, and this
> > is perfectly fine.
> >
> > The problem is that it stays at '1' when oprofile starts using the NMI
> > watchdog, and we certainly want more than one oprofile tick per second
> > :-)
>
> Just making sure you understand the problem, here is the
> sequence of events:
>
> 1) At bootup, the NMI watchdog is tested.
>
> It is tested with nmi_hz=HZ
>
> 2) If the test passes, nmi_hz is reduced down to '1'
>
> As I stated, everything up to this point is fine. Next:
>
> 3) oprofile initializes and if we choose to use the NMI
> timer for oprofile profiling it is implemented using
> a simple DIE_NMI notifier.
>
> However, nmi_hz is still just '1' which means that oprofile
> will only receive one sample per-second. And this is definitely
> not what we want.
>
> Somehow the code in arch/x86/oprofile/nmi_timer_int.c needs to have an
> interface into the NMI watchdog core so that it can increase nmi_hz back
> up to "HZ" when the NMI timer profiling is enabled and back down to "1"
> when such profiling stops.

btw., these types of interactions will be solved in a natural way via
perfcounters: in that model the NMI watchdog is a set of per-CPU counters
[with an NMI watchdog callback in the IRQ handling routine] - and oprofile
uses its own perfcounter, allocated from what is left of the PMU hardware.

I.e. each PMU-using facility can just use performance counters
transparently, and interactions will be solved naturally by perfcounters
resource management.

Ingo