Re: [RFC PATCH] perf: Add load latency monitoring on Intel Nehalem/Westmere

From: Lin Ming
Date: Thu Dec 23 2010 - 03:56:00 EST


On Wed, 2010-12-22 at 18:49 +0800, Peter Zijlstra wrote:
> On Wed, 2010-12-22 at 11:45 +0100, Peter Zijlstra wrote:
> > On Wed, 2010-12-22 at 11:08 +0100, Stephane Eranian wrote:
> > > Yes, I think there is more to it than just data source, unfortunately.
> > > If you want to avoid returning an opaque u64 (PERF_SAMPLE_EXTRA), then
> > > you need to break it down: PERF_SAMPLE_DATA_SRC, PERF_SAMPLE_XX
> > > and so on.
> >
> > I guess we can do things like:
> >
> > Satisfied by {L1, L2, L3, RAM}x{snoop, local, remote} + unknown, and
> > encode "Pending core cache HIT" as L2-snoop or something, whatever is
> > most appropriate.
>
> Ah, I just saw my email window covered part of the spec and we can also
> have x{shared,exclusive}, so we end up with:
>
> {L1, L2, L3, RAM}x{snoop, local, remote}x{shared, exclusive} + {unknown,
> uncached, IO}
>
> Which takes all of 5 bits to encode.

Do you mean the encoding below?

bits  4 3 2 1 0
      + + + + +
      | | | | |
      | | | {L1, L2, L3, RAM} or {unknown, uncached, IO}
      | | |
      | {snoop, local, remote, OTHER}
      |
      {shared, exclusive}

If bits 2-3 are OTHER, then bits 0-1 encode one of {unknown, uncached, IO}
instead of a cache level.
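
To make the layout concrete, here is a minimal C sketch of packing and
unpacking such a 5-bit field. Only the bit positions come from the diagram
above. The identifier names (src_encode, SRC_L1, ...) and the numeric value
assigned to each member are assumptions for illustration and are not part of
the patch. What bit 4 means when bits 2-3 are OTHER is not specified in this
thread, so the sketch just stores whatever it is given.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names and values; only the bit positions follow the mail. */
enum src_level_id { SRC_L1 = 0, SRC_L2 = 1, SRC_L3 = 2, SRC_RAM = 3 };
enum src_other_id { SRC_UNKNOWN = 0, SRC_UNCACHED = 1, SRC_IO = 2 };
enum src_kind_id  { SRC_SNOOP = 0, SRC_LOCAL = 1, SRC_REMOTE = 2, SRC_OTHER = 3 };
enum src_state_id { SRC_SHARED = 0, SRC_EXCLUSIVE = 1 };

#define SRC_LEVEL_SHIFT 0	/* bits 0-1: level, or the OTHER sub-code */
#define SRC_LEVEL_MASK  0x3u
#define SRC_KIND_SHIFT  2	/* bits 2-3: snoop/local/remote/OTHER */
#define SRC_KIND_MASK   0x3u
#define SRC_STATE_SHIFT 4	/* bit 4: shared/exclusive */
#define SRC_STATE_MASK  0x1u

/* Pack the three sub-fields into the low 5 bits of a byte. */
static inline uint8_t src_encode(unsigned level, unsigned kind, unsigned state)
{
	assert(level <= SRC_LEVEL_MASK);
	assert(kind <= SRC_KIND_MASK);
	assert(state <= SRC_STATE_MASK);

	return (uint8_t)((state << SRC_STATE_SHIFT) |
			 (kind << SRC_KIND_SHIFT) |
			 (level << SRC_LEVEL_SHIFT));
}

/* Extract bits 0-1. Interpret as src_other_id when src_is_other() is true. */
static inline unsigned src_get_level(uint8_t v)
{
	return (v >> SRC_LEVEL_SHIFT) & SRC_LEVEL_MASK;
}

/* Extract bits 2-3. */
static inline unsigned src_get_kind(uint8_t v)
{
	return (v >> SRC_KIND_SHIFT) & SRC_KIND_MASK;
}

/* Extract bit 4. */
static inline unsigned src_get_state(uint8_t v)
{
	return (v >> SRC_STATE_SHIFT) & SRC_STATE_MASK;
}

/* True when bits 0-1 hold {unknown, uncached, IO} rather than a level. */
static inline int src_is_other(uint8_t v)
{
	return src_get_kind(v) == SRC_OTHER;
}
```

With these assumed values, src_encode(SRC_L3, SRC_REMOTE, SRC_EXCLUSIVE)
gives 0x1a and src_encode(SRC_IO, SRC_OTHER, SRC_SHARED) gives 0x0e. The
4x3x2 + 3 = 27 combinations fit in the 32 codes that 5 bits provide.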

>
> > But does that cover every architecture?
> >
> > Also, since that doesn't require more than 4 bits to encode, we could
> > try and categorize what else is around and try and create a well
> > specified _EXTRA register, I mean, we still got 60bits left after this.
>
> Leaving us with 59 bits to consider.

