Re: missing madvise functionality

From: Eric Dumazet
Date: Tue Apr 03 2007 - 17:30:52 EST


Rik van Riel a Ãcrit :
Eric Dumazet wrote:
Rik van Riel a Ãcrit :
Andrew Morton wrote:

Oh. I was assuming that we'd want to unmap these pages from pagetables and
mark then super-easily-reclaimable. So a later touch would incur a minor
fault.

But you think that we should leave them mapped into pagetables so no such
fault occurs.

Leaving the pages mapped into pagetables means that they are considerably
less likely to be reclaimed.

If we move the pages to a place where they are very likely to be
reclaimed quickly (end of the inactive list, or a separate
reclaim list) and clear the dirty and referenced lists, we can
both reclaim the page easily *and* avoid the page fault penalty.


There is one possible speedup :

- If an user app does a madvise(MADV_DONTNEED), we can assume the pages can later be bring back without need to zero them. The application doesnt care.

... however, the application that previously used that page might
care a lot!

The application that does madvise(MADV_WHATEVER_MEANS_KENREL_CAN_DROP)
doesnt care. It it cares, it would use munmap(), or no syscall at all.


mmap()/brk() must give fresh NULL pages, but maybe madvise(MADV_DONTNEED) can relax this requirement (if the pages were reclaimed, then a page fault could bring a new page with random content)

If we bring in a new page, it has to be zeroed for security
reasons.

You don't want somebody else's process to get a page with
your password in it.

Then an application that cares of passwd wont use madvise(MADV_WHATEVER_MEANS_I_DONT_CARE)

;)

Maybe I was not clear, but I was refering to a pool of 'discardable' pages, that would be feeded by applications that want to notify kernel some pages can be completly discarded (contains no security data of course, nor data that the applications dont want to forget), and might be given to a consumer without the need of zeroing it.

We might make this pool private to each process, but then it would benefit to less workloads I guess...

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/