On Sun, 2005-11-06 at 19:20 +1100, Nick Piggin wrote:
2/14
plain text document attachment (mm-pte-prefetch.patch)
Prefetch ptes a line ahead. Worth 25% on ia64 when doing big forks.
Index: linux-2.6/include/asm-generic/pgtable.h
===================================================================
--- linux-2.6.orig/include/asm-generic/pgtable.h
+++ linux-2.6/include/asm-generic/pgtable.h
@@ -196,6 +196,33 @@ static inline void ptep_set_wrprotect(st
})
#endif
+#ifndef __HAVE_ARCH_PTE_PREFETCH
+#define PTES_PER_LINE (L1_CACHE_BYTES / sizeof(pte_t))
+#define PTE_LINE_MASK (~(PTES_PER_LINE - 1))
+#define ADDR_PER_LINE (PTES_PER_LINE << PAGE_SHIFT)
+#define ADDR_LINE_MASK (~(ADDR_PER_LINE - 1))
+
+#define pte_prefetch(pte, addr, end) \
+({ \
+ unsigned long __nextline = ((addr) + ADDR_PER_LINE) & ADDR_LINE_MASK; \
+ if (__nextline < (end)) \
+ prefetch(pte + PTES_PER_LINE); \
+})
+
Are you sure this is right? At least on PCs, a branch mispredict
is very expensive, and might well cost more than the gain
you get from a prefetch.
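
For reference, the test in the quoted macro can be modelled in user space
roughly as below. This is only a sketch: the 4 KiB page size, the 64-byte
cache line, the one-word pte_t, and the helper names pte_next_line() and
pte_should_prefetch() are assumptions for illustration, not part of the
patch.

```c
/* Assumed values for illustration; the kernel takes these from arch headers. */
#define PAGE_SHIFT      12
#define L1_CACHE_BYTES  64

/* Assumed layout: one machine word per pte. */
typedef struct { unsigned long pte; } pte_t;

/* Same definitions as in the quoted patch. */
#define PTES_PER_LINE   (L1_CACHE_BYTES / sizeof(pte_t))
#define ADDR_PER_LINE   (PTES_PER_LINE << PAGE_SHIFT)
#define ADDR_LINE_MASK  (~(ADDR_PER_LINE - 1))

/* Hypothetical helper: first address mapped by the next cache line of ptes. */
static unsigned long pte_next_line(unsigned long addr)
{
    return (addr + ADDR_PER_LINE) & ADDR_LINE_MASK;
}

/*
 * Hypothetical helper: the condition the macro branches on.
 * Nonzero while the next line of ptes still starts before 'end'.
 */
static int pte_should_prefetch(unsigned long addr, unsigned long end)
{
    return pte_next_line(addr) < end;
}
```

Under these assumptions the test is false only when the next line of ptes
starts at or beyond end, so over a long range it changes outcome only on
the final line. Whether that makes the branch cheap enough in practice is
something only a measurement can tell.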