Pentium SSE prefetcht0 instruction... How do you make it work

From: Bulent Abali
Date: Thu Sep 27 2001 - 13:40:55 EST

I have a system with a large external L3 cache. It uses a Pentium III
Coppermine (stepping 3) processor. The L3 block size is relatively long,
and I am trying to improve L3 performance by prefetching L3 lines. I
thought the prefetcht0, prefetcht1, prefetcht2, and prefetchnta
instructions might help. These instructions prefetch L1/L2 lines into the
processor, so they should pull in the corresponding L3 lines as well.
However, I see no benefit. It is as if the prefetches never make it to
the front side bus. I used arch/i386/lib/mmx.c as an example when
implementing the asm-level routines. Perhaps I am overlooking something.
I'd appreciate any suggestions. /Bulent

// This should prefetch an L2 line at addr (hence an L3 line prefetch)
inline void L3_prefetch (char * addr)
{
     asm volatile("prefetcht1 %0" :: "m" (addr));
}
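
One possible reason for seeing no benefit, offered as a guess rather than a
confirmed diagnosis: with GCC inline asm, the operand "m" (addr) refers to
the memory holding the pointer variable itself, so the routine above would
prefetch the stack slot of addr and not the line it points to. A minimal
sketch of the dereferenced form follows. The names prefetch_line,
sum_with_prefetch, and PREFETCH_AHEAD are hypothetical, the 256-byte
distance is an untuned assumption, and the non-x86 fallback uses GCC's
__builtin_prefetch only so the sketch builds elsewhere.

```c
#include <stddef.h>

#define CACHE_LINE      32   /* Pentium III L1/L2 line size in bytes */
#define PREFETCH_AHEAD  256  /* assumed prefetch distance, not tuned */

/* Hypothetical helper: hint that the line containing *addr be fetched. */
static inline void prefetch_line(const char *addr)
{
#if defined(__i386__) || defined(__x86_64__)
    /* "m" (*addr) names the byte that addr points to, so the memory
       operand of the instruction is the target address itself. */
    __asm__ __volatile__("prefetcht1 %0" : : "m" (*addr));
#else
    /* Fallback for other targets: read access, moderate locality. */
    __builtin_prefetch(addr, 0, 2);
#endif
}

/* Hypothetical usage: sum a buffer while prefetching a fixed distance
   ahead, once per cache line, and only while the target is in bounds. */
static unsigned long sum_with_prefetch(const char *buf, size_t len)
{
    unsigned long sum = 0;
    size_t i;

    for (i = 0; i < len; i++) {
        if ((i % CACHE_LINE) == 0 && i + PREFETCH_AHEAD < len)
            prefetch_line(buf + i + PREFETCH_AHEAD);
        sum += (unsigned char)buf[i];
    }
    return sum;
}
```

A prefetch is only a hint and never changes program results, so any gain
has to be confirmed by timing the loop with and without the prefetch_line
call. Whether a hit in L2 still reaches an external L3 depends on the
chipset, which this sketch does not address.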

