[PATCH 38/46] fs: prefetch inode data in dcache lookup

From: Nick Piggin
Date: Sat Nov 27 2010 - 05:32:12 EST


This gains another 5% or so on the cached git diff workload by
prefetching the important first cacheline of the inode while
we do the actual name compare and other operations on the dentry.
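
For readers outside the kernel, the idea can be sketched in user space
roughly as follows. This is an illustration only, not the kernel code:
toy_inode, toy_dentry and toy_lookup are hypothetical names, and
__builtin_prefetch (a GCC/Clang builtin) is assumed here as a stand-in
for the kernel's prefetch().

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the kernel structures; not the real layouts. */
struct toy_inode {
	unsigned long i_mode;
};

struct toy_dentry {
	struct toy_dentry *next;	/* next entry on the hash chain */
	const char *name;
	size_t len;
	struct toy_inode *inode;	/* NULL for a negative entry */
};

/*
 * Walk a chain looking for a name. The prefetch hints are issued before
 * the name compare, so the loads of the name and of the inode's first
 * cacheline can overlap with the compare instead of stalling the caller
 * afterwards. The hints do not change the result of the lookup.
 */
static struct toy_inode *toy_lookup(struct toy_dentry *head,
				    const char *name, size_t len)
{
	struct toy_dentry *d;

	for (d = head; d; d = d->next) {
		const char *tname = d->name;
		struct toy_inode *i = d->inode;

		__builtin_prefetch(tname);
		if (i)
			__builtin_prefetch(i);

		if (d->len == len && memcmp(tname, name, len) == 0)
			return i;
	}
	return NULL;
}
```

Whether the hint helps depends on how much work sits between the
prefetch and the first real use of the inode; the sketch shows only the
ordering, not the measured gain.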

There was no measurable slowdown in the single file stat case, or
the creat case (where negative dentries would be common). (Actually
there was about a 5 nanosecond speedup in these cases, but I can't
say it is significant.)

Workload is 100 git diffs in sequence:
real     user     sys

vanilla single thread
0m9.753s 0m1.860s 0m7.230s
0m9.752s 0m1.960s 0m7.270s
0m9.754s 0m1.870s 0m7.290s
0m9.749s 0m1.910s 0m7.330s
0m9.750s 0m2.110s 0m7.060s

scale single thread
0m7.678s 0m1.990s 0m5.090s
0m7.682s 0m2.090s 0m5.000s
0m7.681s 0m1.970s 0m5.100s
0m7.679s 0m1.810s 0m5.280s
0m7.679s 0m1.970s 0m5.100s

The single threaded case has about 25% higher throughput. Throughput
of the kernel portion alone (system time) is increased by about 45%.
This is incredibly significant for a single threaded performance
increase in core kernel code in 2010.
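
As a rough check of those percentages, the gain can be computed from any
pair of rows in the tables above. The helper below is a hypothetical
sketch (throughput_gain_pct is not part of the patch) and assumes that
throughput is inversely proportional to the time taken for the same 100
diffs.

```c
/*
 * Percentage throughput gain when the same amount of work takes
 * t_new seconds instead of t_old seconds.
 */
static double throughput_gain_pct(double t_old, double t_new)
{
	return (t_old / t_new - 1.0) * 100.0;
}
```

For the first row of each table, throughput_gain_pct(9.753, 7.678) gives
the wall-clock gain and throughput_gain_pct(7.230, 5.090) the gain in
system time, both in line with the approximate figures quoted above.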

vanilla multi thread (preloadindex=true)
0m6.517s 0m1.430s 0m20.200s
0m6.514s 0m1.360s 0m20.230s
0m6.521s 0m1.410s 0m20.090s
0m6.519s 0m1.410s 0m20.060s
0m6.521s 0m1.610s 0m20.140s

scale multi thread (preloadindex=true)
0m3.301s 0m0.840s 0m3.300s
0m3.304s 0m0.940s 0m3.320s
0m3.291s 0m0.930s 0m3.170s
0m3.292s 0m0.900s 0m3.230s
0m3.277s 0m0.770s 0m3.230s

Parallel case throughput is very nearly doubled, despite git being
unable to produce enough work to keep all CPUs busy (118% CPU used
over the duration of the test). System time shows that scalability
of path walk has already turned to shit in the vanilla kernel.

Signed-off-by: Nick Piggin <npiggin@xxxxxxxxx>
---
fs/dcache.c | 3 +++
1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/fs/dcache.c b/fs/dcache.c
index 58faf37..fa6e7a5 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -1658,6 +1658,9 @@ seqretry:
 		tlen = dentry->d_name.len;
 		tname = dentry->d_name.name;
 		i = dentry->d_inode;
+		prefetch(tname);
+		if (i)
+			prefetch(i);
 		/*
 		 * This seqcount check is required to ensure name and
 		 * len are loaded atomically, so as not to walk off the
--
1.7.1
