Re: [PATCH 10/17] 2.6.17.1 perfmon2 patch for review: PMU context switch

From: Stephane Eranian
Date: Fri Jun 30 2006 - 14:49:36 EST


Chuck,

On Fri, Jun 30, 2006 at 02:33:49PM -0400, Chuck Ebbert wrote:
> In-Reply-To: <200606301541.22928.ak@xxxxxxx>
>
> On Fri, 30 Jun 2006 15:41:22 +0200, Andi Kleen wrote:
>
> > > So why do we need care about context switch in cpu-wide mode?
> > > It is because we support a mode where the idle thread is excluded
> > > from cpu-wide monitoring. This is very useful to distinguish
> > > 'useful kernel work' from 'idle'.
> >
> > I don't quite see the point because on x86 the PMU doesn't run
> > during C states anyways. So you get idle excluded automatically.
>
> Looks like it does run:
>
> $ pfmon -ecpu_clk_unhalted,interrupts_masked_cycles -k --system-wide -t 10
> <session to end in 10 seconds>
> CPU0 60351837 CPU_CLK_UNHALTED
> CPU0 346548229 INTERRUPTS_MASKED_CYCLES
>
> The CPU spent ~60 million clocks unhalted and ~350 million with interrupts
> disabled. (This is an idle 1.6GHz Turion64 machine.)
>
I think it stops counting only CERTAIN events. That's where it becomes difficult
to interpret. There was a defect like this on Itanium 2.

> Now let's see what happens when we exclude the idle thread:
>
> $ pfmon -ecpu_clk_unhalted,interrupts_masked_cycles -k --system-wide -t 10 --excl-idle
> <session to end in 10 seconds>
> CPU0 449250 CPU_CLK_UNHALTED
> CPU0 161577 INTERRUPTS_MASKED_CYCLES
>
> Looks like excluding the idle thread means interrupts that happen while idle
> don't get counted either. We took 5000 clock interrupts and I know they take
> longer than that to process.
>

Yes, the way it is implemented today means that nothing happening while the idle
threads runs/sleeps is counted. Monitoring is disabled at context switch boundaries
for task->pid==0.

I would rather have something that would stop counting *EVERYTHING* ONLY when
going low-power OR when busy looping. That's why the idle notififers that Andi mentioned
are interesting. But this would not cover interrupts received while in idle and yet we
want to measure those as they represent useful kernel work. This is unless interrupts while
idle qualifies for an exit_idle() notification.

--
-Stephane
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/