Re: x86, ARM, PARISC, PPC, MIPS and Sparc folks please run this

From: Jamie Lokier
Date: Mon Sep 01 2003 - 00:04:00 EST


Andi Kleen wrote:
> > I already got a surprise (to me): my Athlon MP is much slower
> > accessing multiple mappings which are within 32k of each other, than
> > mappings which are further apart, although it is coherent. The L1
>
> Most x86 and probably most other modern CPUs have virtually
> addressed L1 caches. It's just too slow to wait for the MMU for an
> L1 access which is really critical.
>
> So such artifacts are expected

I hadn't thought so at first, because there's no artefact at all (not
even a small one) on my Celeron, but you're right. The artefacts don't
appear on any Intels(*), but they do on all AMDs that I have results for.

(*) With the possible exception of one P4 that reports varying results.
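
For readers who want to reproduce this kind of measurement, here is a
minimal sketch. It is not the test program referred to in this thread.
The function name time_alias_pair and its parameters are hypothetical,
and it assumes a POSIX system with mmap(), MAP_ANONYMOUS and
clock_gettime(). It maps one page of a temporary file at two virtual
addresses a chosen distance apart and times alternating stores through
both mappings.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* for MAP_ANONYMOUS under strict -std modes */
#endif
#include <stdio.h>
#include <time.h>
#include <unistd.h>
#include <sys/mman.h>

/* Hypothetical helper: map the same file page at two virtual addresses
 * `distance` bytes apart and time alternating stores through both.
 * Returns nanoseconds per loop iteration, or -1.0 on failure (including
 * the case where the two mappings turn out not to be coherent).
 * `distance` must be a non-zero multiple of the page size. */
static double time_alias_pair(size_t distance, long iterations)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || iterations <= 0 ||
        distance == 0 || distance % (size_t)page != 0)
        return -1.0;

    FILE *f = tmpfile();
    if (!f)
        return -1.0;
    int fd = fileno(f);
    if (ftruncate(fd, page) != 0) {
        fclose(f);
        return -1.0;
    }

    /* Reserve address space so both fixed mappings land where we choose. */
    size_t span = distance + (size_t)page;
    char *base = mmap(NULL, span, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED) {
        fclose(f);
        return -1.0;
    }

    volatile char *a = mmap(base, (size_t)page, PROT_READ | PROT_WRITE,
                            MAP_SHARED | MAP_FIXED, fd, 0);
    volatile char *b = mmap(base + distance, (size_t)page,
                            PROT_READ | PROT_WRITE,
                            MAP_SHARED | MAP_FIXED, fd, 0);
    double ns = -1.0;
    if (a != MAP_FAILED && b != MAP_FAILED) {
        struct timespec t0, t1;
        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (long i = 0; i < iterations; i++) {
            a[0] = (char)i;        /* store through the first alias */
            b[0] = (char)(i + 1);  /* store through the second alias */
        }
        clock_gettime(CLOCK_MONOTONIC, &t1);
        /* Coherence check: the last store went through b, so a must see it. */
        if (a[0] == (char)iterations)
            ns = ((double)(t1.tv_sec - t0.tv_sec) * 1e9 +
                  (double)(t1.tv_nsec - t0.tv_nsec)) / (double)iterations;
    }

    munmap(base, span);   /* unmaps both aliases and the reservation */
    fclose(f);
    return ns;
}
```

Comparing, say, time_alias_pair(4096, 10000000) with
time_alias_pair(65536, 10000000) on a machine with 4k pages would be one
way to look for the effect. The absolute numbers depend heavily on the
compiler, the timer and the CPU, so only the ratio between distances is
meaningful.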

>
> > data cache is 64k. (The explanation is easy: virtually indexed,
> > physically tagged cache moves data among cache lines, possibly via L2).
>
> On x86 L2 is usually physically tagged.

I'm speculating that L1 is physically tagged, and when there's a
virtual alias the CPU moves data from one L1 line to another. L2 only
comes into it because the line transfer is slow enough that a
MESI-style transfer through L2 (as if another CPU or device requested
the line) would account for the slowness.
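
The index arithmetic behind that speculation can be sketched as
follows. Only the 64k cache size comes from this thread. The 2-way
associativity, 64-byte line size and 4k page size are assumptions made
for illustration, and the function names are hypothetical. Under those
assumptions each way covers 32k, so virtual address bits above the page
offset help select the set, and two mappings of the same page pick the
same set only when their addresses are equal modulo 32k.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed geometry. Only the 64k size is stated in the thread. */
enum {
    CACHE_SIZE        = 64 * 1024,
    CACHE_WAYS        = 2,                        /* assumption */
    LINE_SIZE         = 64,                       /* assumption */
    ASSUMED_PAGE_SIZE = 4096,                     /* assumption */
    WAY_SIZE          = CACHE_SIZE / CACHE_WAYS,  /* 32k per way */
    NUM_SETS          = WAY_SIZE / LINE_SIZE      /* 512 sets */
};

/* Set selected by a virtual address in a virtually indexed cache. */
static unsigned l1_set_index(uintptr_t vaddr)
{
    return (unsigned)((vaddr % WAY_SIZE) / LINE_SIZE);
}

/* "Colour" of a page: the index bits that lie above the page offset. */
static unsigned page_colour(uintptr_t vaddr)
{
    return (unsigned)((vaddr % WAY_SIZE) / ASSUMED_PAGE_SIZE);
}

/* Two virtual addresses with the same page offset, mapping the same
 * physical byte, select the same set only if this returns true. */
static bool aliases_share_set(uintptr_t va1, uintptr_t va2)
{
    return l1_set_index(va1) == l1_set_index(va2);
}
```

With this geometry there are 8 page colours. Aliases 4k apart select
different sets, which is the case where a physically tagged cache would
have to move or invalidate the line, while aliases 32k or 64k apart
select the same set and can hit the same line.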

-- Jamie