> [good ideas trimmed]
As part of my Master's thesis work I implemented (in Linux) a
prefetching scheme similar to the above, although it's currently
limited to prefetching from swap devices -- not from mmap'd files.
The thesis is primarily an evaluation of a page replacement scheme
that works by detecting sequential memory access patterns -- by
observing sequential page faults -- and then performing MRU
replacement on these sequentially-accessed regions. We get pretty
good speedups (50-100% not uncommon) for a number of iterative
programs, particularly programs that iterate over a single, large
block of memory. (We were surprised by the number of programs that
behave this way, especially image processing programs: the canonical
thing to do seems to be to iterate a few times over an in-core image
array.) We also found that doing prefetching from the swap device
sped up a lot of programs substantially; the SCSI layers do a very
nice job of combining multiple disk requests, and often we were
reading ~64KB off the disk in one shot.
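
To illustrate why MRU replacement helps a program that loops over one
large block, here is a toy user-space simulation. It is only a sketch,
not the thesis code: it applies one policy to all pages rather than
only to detected sequential regions, and the function name, the
100-page array, and the 80-frame memory size are all made up for the
example.

```python
from collections import OrderedDict

def count_faults(refs, frames, policy):
    """Count page faults for a reference string under 'lru' or 'mru' eviction."""
    resident = OrderedDict()  # keys ordered from least to most recently used
    faults = 0
    for page in refs:
        if page in resident:
            resident.move_to_end(page)  # mark as most recently used
            continue
        faults += 1
        if len(resident) >= frames:
            # last=False evicts the least recently used page,
            # last=True evicts the most recently used one.
            resident.popitem(last=(policy == "mru"))
        resident[page] = None
    return faults

# Hypothetical workload: 4 passes over a 100-page array, 80 frames of memory.
refs = list(range(100)) * 4
lru_faults = count_faults(refs, 80, "lru")
mru_faults = count_faults(refs, 80, "mru")
print("LRU faults:", lru_faults, "MRU faults:", mru_faults)
```

In this toy setup LRU faults on every one of the 400 references,
because each page is evicted just before it is needed again, while MRU
keeps most of the array resident and faults 160 times.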
Anyway, if anybody wants to extend what I've done to do prefetching on
sequentially-accessed mmap'd files, I'm sure it wouldn't be hard.
Filling in PTEs as Mark suggested would also be possible using the
sequence detection code.
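
For anyone curious what fault-driven sequence detection could look
like, here is a rough user-space sketch. It is not my kernel code: the
class name, the threshold of 3 faults, and the 16-page window (64KB if
you assume 4KB pages) are assumptions chosen for illustration.

```python
class SequenceDetector:
    """Track consecutive page faults and suggest pages to prefetch."""

    def __init__(self, threshold=3, window=16):
        self.threshold = threshold  # hypothetical: sequential faults needed
        self.window = window        # hypothetical: pages fetched per read
        self.last_page = None
        self.run_length = 0

    def on_fault(self, page):
        """Record a fault on `page`; return the list of pages to prefetch."""
        if self.last_page is not None and page == self.last_page + 1:
            self.run_length += 1
        else:
            self.run_length = 1  # run broken, start over
        self.last_page = page
        if self.run_length < self.threshold:
            return []
        # Treat the prefetched pages as already touched, so the next
        # fault just past the window continues the same run.
        self.last_page = page + self.window
        return list(range(page + 1, page + 1 + self.window))

detector = SequenceDetector()
for faulting_page in (10, 11, 12):
    to_prefetch = detector.on_fault(faulting_page)
```

The same hook could, in principle, drive readahead on an mmap'd file
or pre-fill PTEs instead of issuing swap reads; only the action taken
on a detected run would change.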
My kernel code is kind of a mess right now, having suffered through a
failed last-minute experiment where I added some stuff to the page
table entry format. Also, I'm away from my Linux machine for the
summer, so I can't do any real work on it. However, if anyone is
interested, I'd be happy to try to put a patch together.
Obligatory plug: my thesis and a paper with (early) simulation
results are available at ftp://bark.cs.wisc.edu/pub/gid/papers/ .
cheers,
Gideon Glass