Re: [PATCH 2/2] perf_counter: optimize context switch between identical inherited contexts

From: Paul Mackerras
Date: Fri May 22 2009 - 06:08:41 EST


Peter Zijlstra writes:

> On Fri, 2009-05-22 at 14:27 +1000, Paul Mackerras wrote:
> >
> >              -----Unmodified-----   With this patch series
> > Counters:      none   2 HW  4H+4S      none   2 HW  4H+4S
> >
> > 2 processes:
> > Average        3.44   6.45  11.24      3.12   3.39   3.60
> > St dev         0.04   0.04   0.13      0.05   0.17   0.19
> >
> > 8 processes:
> > Average        6.45   8.79  14.00      5.57   6.23   7.57
> > St dev         1.27   1.04   0.88      1.42   1.46   1.42
> >
> > 32 processes:
> > Average        5.56   8.43  13.78      5.28   5.55   7.15
> > St dev         0.41   0.47   0.53      0.54   0.57   0.81
>
> Any clues as to why the time is still dependent on the number of
> counters in the context? The approach seems to be O(1) in that it
> does a simple counter context swap on sched_out and nothing on sched_in.

Only the context switches between the lat_ctx processes will be
optimized; switches between them and other processes will still do the
full PMU switch. The CPU would be switching to other processes
from time to time during the run (e.g. to run various kernel daemons,
which seem to continue to proliferate) so some fraction of the context
switches would be as expensive as before. Probably I should measure
what that fraction is, but it's Friday night and I'm feeling lazy. :)
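
In the absence of a real measurement, a rough estimate can be backed out
of the table above. This is only a sketch under two assumptions that
the measurements do not confirm: that an optimized switch costs about
the same as the patched "none" case, and that an unoptimized switch
costs about the same as the unmodified kernel with the same counters.
With those, the average is modelled as
t_avg = (1 - f) * t_fast + f * t_slow and solved for f. The function
name slow_switch_fraction is made up for illustration.

```python
def slow_switch_fraction(t_avg, t_fast, t_slow):
    """Solve t_avg = (1 - f) * t_fast + f * t_slow for f.

    f is the modelled fraction of context switches that still pay
    for the full PMU switch.
    """
    if t_slow == t_fast:
        raise ValueError("fast and slow costs must differ")
    return (t_avg - t_fast) / (t_slow - t_fast)


# Averages from the "2 processes" rows of the table.
patched_none = 3.12                             # assumed cost of an optimized switch
patched = {"2 HW": 3.39, "4H+4S": 3.60}         # with this patch series
unmodified = {"2 HW": 6.45, "4H+4S": 11.24}     # assumed cost of a full PMU switch

fractions = {
    name: slow_switch_fraction(patched[name], patched_none, unmodified[name])
    for name in patched
}

for name, f in fractions.items():
    print(f"{name}: roughly {f:.1%} of switches modelled as unoptimized")
```

The two counter configurations need not give the same f, and the
standard deviations in the table are large relative to the differences
involved, so this is an order-of-magnitude guess and no substitute for
counting the switches directly.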

Paul.