Re: [RFC PATCH] perf: Add load latency monitoring on Intel Nehalem/Westmere

From: Peter Zijlstra
Date: Thu Dec 23 2010 - 05:51:11 EST


On Thu, 2010-12-23 at 11:31 +0100, Stephane Eranian wrote:
> On Thu, Dec 23, 2010 at 11:18 AM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> > On Thu, 2010-12-23 at 16:59 +0800, Lin Ming wrote:
> >> > {L1, L2, L3, RAM}x{snoop, local, remote}x{shared, exclusive} + {unknown,
> >> > uncached, IO}
> >> >
> >> > Which takes all of 5 bits to encode.
> >>
> >> Do you mean below encoding?
> >>
> >> bits4     3     2     1     0
> >>     +     +     +     +     +
> >>     |     |     |     |     |
> >>     |     |     |     {L1, L2, L3, RAM} or {unknown, uncached, IO}
> >>     |     |     |
> >>     |     {snoop, local, remote, OTHER}
> >>     |
> >>     {shared, exclusive}
> >>
> >> If bits(2-3) is OTHER, then bits(0-1) is the encoding of {unknown,
> >> uncached, IO}.
> >
> > That is most certainly a very valid encoding, and a rather nice one at
> > that. I hadn't really gone further than: 4*3*2 + 3 < 2^5 :-)
> >
> > If you also make OTHER=0, then a valid encoding for unknown is also 0,
> > which is a nice meaning for 0...
> >
> I am not sure how you would cover the 9 possibilities for data source as
> shown in Table 10-13 using this encoding. Could you show me?

Ah, I think I see the problem: there are multiple L3-snoop variants. I
guess we can fix that by extending {shared, exclusive} to full MESI,
growing us to 6 bits.
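
The bit counts can be double-checked with a few lines of C. This is only
a minimal sketch; ds_count() and bits_needed() are made-up helper names,
not anything from the patch:

```c
/*
 * Number of distinct data-source values: levels x access types x
 * coherency states, plus the stand-alone extras (unknown, uncached, IO).
 * Hypothetical helper, for illustration only.
 */
unsigned ds_count(unsigned levels, unsigned types, unsigned states,
                  unsigned extras)
{
    return levels * types * states + extras;
}

/* Smallest number of bits b such that 2^b >= n. */
unsigned bits_needed(unsigned n)
{
    unsigned bits = 0;

    while ((1u << bits) < n)
        bits++;
    return bits;
}
```

With {shared, exclusive}, ds_count(4, 3, 2, 3) is 27, which fits in 5
bits; with full MESI, ds_count(4, 3, 4, 3) is 51, which needs 6.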

I'm assuming you mean "Table 30-13. Data Source Encoding for Load
Latency Record", which has 14 values defined.

Value   Intel                               Perf
0x0     Unknown L3                          Unknown

0x1     L1                                  L1-local

0x2     Pending core cache HIT              L2-snoop
        Outstanding core cache miss to
        the same line was underway
0x3     L2                                  L2-local

0x4     L3-snoop, no coherency actions      L3-snoop-I
0x5     L3-snoop, found no M                L3-snoop-S
0x6     L3-snoop, found M                   L3-snoop-M

0x8     L3-miss, snoop, shared              RAM-snoop-S
0xA     L3-miss, local, shared              RAM-local-S
0xB     L3-miss, remote, shared             RAM-remote-S

0xC     L3-miss, local, exclusive           RAM-local-E
0xD     L3-miss, remote, exclusive          RAM-remote-E

0xE     IO                                  IO
0xF     uncached                            uncached


Leaving us with:

{L1, L2, L3, RAM}x{snoop, local, remote}x{modified, exclusive, shared, invalid} + {unknown, uncached, IO}

Now the question is, is this sufficient to map all data sources from
other archs as well?
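
As a rough sketch of what such a 6-bit value could look like, here is
Lin Ming's layout with OTHER=0 and the top field widened to two bits for
MESI. All names, the field order and the choice of invalid=0 (so that an
all-zero value means "unknown") are assumptions for illustration, not a
proposed ABI:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; none of this is an existing perf interface. */

/* bits 0-1: cache level ... */
enum ds_level  { DS_L1 = 0, DS_L2, DS_L3, DS_RAM };
/* ... or, when bits 2-3 are DS_OTHER, one of the stand-alone sources */
enum ds_other  { DS_UNKNOWN = 0, DS_UNCACHED, DS_IO };
/* bits 2-3: how the line was obtained; OTHER=0 so that unknown is 0 */
enum ds_access { DS_OTHER = 0, DS_SNOOP, DS_LOCAL, DS_REMOTE };
/* bits 4-5: MESI state (ordering is an assumption) */
enum ds_state  { DS_INVALID = 0, DS_SHARED, DS_EXCLUSIVE, DS_MODIFIED };

/* Pack the three 2-bit fields into the low 6 bits of a byte. */
uint8_t ds_encode(unsigned low, enum ds_access access, enum ds_state state)
{
    assert(low < 4 && (unsigned)access < 4 && (unsigned)state < 4);
    return (uint8_t)(low | ((unsigned)access << 2) | ((unsigned)state << 4));
}

/* Field accessors: bits 0-1, bits 2-3 and bits 4-5 respectively. */
unsigned ds_low(uint8_t v)
{
    return v & 0x3;
}

enum ds_access ds_access_of(uint8_t v)
{
    return (enum ds_access)((v >> 2) & 0x3);
}

enum ds_state ds_state_of(uint8_t v)
{
    return (enum ds_state)((v >> 4) & 0x3);
}
```

Under these assumptions, Intel's 0x6 (L3-snoop, found M) would become
ds_encode(DS_L3, DS_SNOOP, DS_MODIFIED) == 0x36, and unknown is
ds_encode(DS_UNKNOWN, DS_OTHER, DS_INVALID) == 0.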