Re: mmap() is slower than read() on SCSI/IDE on 2.0 and 2.1

Kurt Garloff (garloff@kg1.ping.de)
Tue, 15 Dec 1998 09:31:22 +0100


On Tue, Dec 15, 1998 at 01:37:58AM +0000, Jamie Lokier wrote:
> I see a misunderstanding here. Jay refers to pre-fetching (from disk),
> while Dave refers to pre-faulting (setting up page tables and waiting
> for the page data to be present).
>
> These are very different.
>
> > The problematic case (and a real life one) is when all of libc has
> > been faulted into main memory by various processes, when you start one
> > up do you map in all of libc when it gets mmap'd by the application?
> > If not, then which if any pages do you choose?
>
> You don't map any more than you do now. What you _do_ do is readahead
> pages into the page cache, based on when pages are mapped into the
> process. If you just map the lot, (a) it's inefficient (as you know);
> (b) you don't get told when to initiate further asynchronous readahead
> _before_ the pages are needed synchronously.
>
> Anyway, don't the latest VM clustered page-in/swap-in changes have some
> of the sort of effect we're talking about? However, I don't think they
> implement genuine predictive readahead, which could be done like this:
>
> [Example of readahead mmaped data]
>
> See the difference? This way, the process _never_ blocks waiting for data.
> The existing heuristics don't have this property, as far as I know.

This looks very nice. It should boost performance for sequential-like access
patterns.
(BTW: the minor and major page faults reported by /usr/bin/time are the
number of times the kernel had to map a page or to read it from disk,
respectively, right?)

The problem described is that binaries/libraries are mmapped for execution.
The normal access pattern of a binary is far from sequential, since we call
a lot of functions/procedures. (On RISCs it's a little more sequential, but
still bad.)
Here, your readahead strategy would consume too much memory and I/O time.

Now, the question is how the kernel could distinguish these two cases. Maybe
just from whether the pages are executable or not? Maybe madvise()?
[If DaveM is against it, he certainly has his reasons. I don't see a reason
not to give the kernel the hint, however.] Maybe something like a page fault
history?

I think executable-or-not would be a good enough heuristic, but I don't
know whether the VM layer in question has that info. The page fault history
is the most general approach, as it adapts to the access pattern actually
used, but it's not that easy to do ...

-- 
Kurt Garloff <K.Garloff@ping.de>  (Dortmund, FRG)
PGP key on http://student.physik.uni-dortmund.de/homepages/garloff

There is something frustrating about the quality and speed of Linux development. I.e. the quality is too high and the speed is too high, in other words, I can implement this XXXX feature, but I bet someone else has already done it and is just about to release his patch to Linus soon... [From a posting of Tigran Aivazian to linux-kernel, XXXX = disk stat]