Re: Anticipatory prefaulting in the page fault handler V1

From: Jesse Barnes
Date: Wed Dec 08 2004 - 12:34:16 EST


On Wednesday, December 8, 2004 9:24 am, Christoph Lameter wrote:
> Page fault scalability patch and prefaulting. Max prefault order
> increased to 5 (max preallocation of 32 pages):
>
> Gb Rep Threads    User       System      Wall     flt/cpu/s   fault/wsec
> 256  10      8  33.571s   4516.293s  863.021s   36874.099   194356.930
> 256  10     16  33.103s   3737.688s  461.028s   44492.553   363704.484
> 256  10     32  35.094s   3436.561s  321.080s   48326.262   521352.840
> 256  10     64  46.675s   2899.997s  245.020s   56936.124   684214.256
> 256  10    128  85.493s   2890.198s  203.008s   56380.890   826122.524
> 256  10    256  74.299s   1374.973s   99.088s  115762.963  1679630.272
> 256  10    512  62.760s    706.559s   53.027s  218078.311  3149273.714
>
> We are getting almost linear scalability at the high end with both
> patches, ending up with a fault rate of more than 3 million faults per
> second.

Nice results! Any idea how many applications benefit from this sort of
anticipatory faulting? It has implications for NUMA allocation. Imagine an
app that allocates a large virtual address space and then tries to fault in
pages near each CPU in turn. With this patch applied, CPU 2 would end up
referencing pages placed near CPU 1, and CPU 3's fault would then prefault
4 pages, which would in turn be used by CPUs 4-6. Unless I'm missing
something...
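To make that worry concrete, here is a toy userspace simulation (Python, not kernel code) of the scenario: sequential faults with a prefault window that doubles up to 2^5 = 32 pages, as in the patch, and a first-touch placement policy. The doubling heuristic and the per-page "intended node" bookkeeping are my assumptions for illustration, not the patch's actual code.

```python
MAX_ORDER = 5  # max prefault window of 2**MAX_ORDER = 32 pages, as in the patch


def simulate(num_pages, intended_node):
    """Simulate sequential faults with anticipatory prefaulting.

    intended_node[i] is the NUMA node of the CPU meant to use page i.
    A real fault allocates the faulting page *plus* the next window of
    pages on the faulting CPU's node, so pages meant for other nodes
    can end up placed remotely.
    """
    placed_on = [None] * num_pages  # node each page was allocated on
    order = 0
    for i in range(num_pages):
        if placed_on[i] is None:
            # Real fault: place this page and the prefault window on
            # the node of the CPU that faulted (first-touch).
            window = 1 << order
            node = intended_node[i]
            for j in range(i, min(i + window, num_pages)):
                if placed_on[j] is None:
                    placed_on[j] = node
            order = min(order + 1, MAX_ORDER)
    return placed_on


if __name__ == "__main__":
    # 64 pages, intended to be touched 16-per-node by CPUs on nodes 0..3.
    intended = [p // 16 for p in range(64)]
    placed = simulate(64, intended)
    remote = sum(1 for p in range(64) if placed[p] != intended[p])
    print(f"{remote} of 64 pages ended up on a remote node")
```

In this toy run, most of the address space gets placed on whichever node happened to fault first into each window, which is exactly the cross-node reference pattern I am wondering about.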

Then again, I'm not sure how important that is; this approach may well work
in the majority of cases (it's obviously a big win in faults/sec for your
benchmark, but I wonder about subsequent references to those pages from
other CPUs). You can look at /sys/devices/platform/nodeN/meminfo to see
where the pages are coming from.
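If you want to script that check, something like the following sketch would do (the per-node meminfo path above and the exact field set vary by kernel version, so treat the line format assumed here as an assumption):

```python
import glob
import re


def parse_node_meminfo(text):
    """Parse one node's meminfo contents into {field: value_in_kB}.

    Assumes lines of the form 'Node <n> <Field>: <value> kB', which is
    how the per-node sysfs meminfo files are laid out; the set of
    fields (and fields with punctuation in their names) varies by
    kernel version.
    """
    info = {}
    for line in text.splitlines():
        m = re.match(r"\s*Node\s+\d+\s+(\w+):\s+(\d+)", line)
        if m:
            info[m.group(1)] = int(m.group(2))
    return info


def free_memory_per_node(pattern="/sys/devices/platform/node*/meminfo"):
    """Report MemFree per node, using the path from the mail above."""
    result = {}
    for path in sorted(glob.glob(pattern)):
        with open(path) as f:
            result[path] = parse_node_meminfo(f.read()).get("MemFree")
    return result
```

Sampling that before and after the benchmark touches its pages would show which nodes the prefaulted pages actually landed on.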

Jesse
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/