Re: madvise(2) MADV_SEQUENTIAL behavior

From: Chris Snook
Date: Wed Jul 16 2008 - 17:05:46 EST


Rik van Riel wrote:
On Wed, 16 Jul 2008 14:14:55 +0200
Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:

On Tue, 2008-07-15 at 23:03 +0000, Eric Rannaud wrote:
mm/madvise.c and madvise(2) say:

* MADV_SEQUENTIAL - pages in the given range will probably be accessed
* once, so they can be aggressively read ahead, and
* can be freed soon after they are accessed.


But as the sample program at the end of this post shows, and as I
understand the code in mm/filemap.c, MADV_SEQUENTIAL will only increase
the amount of read ahead for the specified page range, but will not
influence the rate at which the pages just read will be freed from
memory.
Correct, various attempts have been made to actually implement this, but
non made it through.

My last attempt was:
http://lkml.org/lkml/2007/7/21/219

Rik recently tried something else based on his split-lru series:
http://lkml.org/lkml/2008/7/15/465

M patch is not going to help with mmap, though.

I believe that for mmap MADV_SEQUENTIAL, we will have to do
an unmap-behind from the fault path. Not every time, but
maybe once per megabyte, unmapping the megabyte behind us.

That way the normal page cache policies (use once, etc) can
take care of page eviction, which should help if the file
is also in use by another process.


Wouldn't it just be easier to not move pages to the active list when they're referenced via an MADV_SEQUENTIAL mapping? If we keep them on the inactive list, they'll be candidates for reclaiming, but they'll still be in pagecache when another task scans through, as long as we're not under memory pressure.

-- Chris
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/