Re: missing madvise functionality

From: Nick Piggin
Date: Wed Apr 04 2007 - 05:46:05 EST

Next message: Geert Uytterhoeven: "Re: [PATCH] Stop pmac_zilog from abusing 8250's device numbers."
Previous message: Ingo Molnar: "Re: 2.6.21-rc5-rt10 troubles"
In reply to: Eric Dumazet: "Re: missing madvise functionality"
Next in thread: Nick Piggin: "Re: missing madvise functionality"

Eric Dumazet wrote:

On Wed, 04 Apr 2007 18:55:18 +1000
Nick Piggin <nickpiggin@xxxxxxxxxxxx> wrote:

Peter Zijlstra wrote:

On Wed, 2007-04-04 at 12:22 +1000, Nick Piggin wrote:

Eric Dumazet wrote:

I do think such workloads might benefit from a vma_cache not shared by all threads but private to each thread. A sequence could invalidate the cache(s).

i.e., instead of an mm->mmap_cache, have an mm->sequence, with each thread having its own current->mmap_cache and current->mm_sequence.

I have a patchset to do exactly this, btw.

/me too

However, I decided against pushing it because when it does happen that a
task is not involved with a vma lookup for longer than it takes the seq
count to wrap we have a stale pointer...

We could go and walk the tasks once in a while to reset the pointer, but
it all got a tad involved.
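To make the wrap concern concrete, here is a minimal userspace sketch. Everything in it is hypothetical: the seq_t type is deliberately only 8 bits wide so the wrap is easy to hit, and mm_seq, thread_cache and the helper names are illustrative stand-ins, not taken from either patchset.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, deliberately tiny counter so that wraparound is easy to see. */
typedef uint8_t seq_t;

struct mm_seq {
    seq_t sequence;         /* bumped on every invalidating change */
};

struct thread_cache {
    const void *vma;        /* cached pointer, possibly stale */
    seq_t mm_sequence;      /* snapshot of mm->sequence at fill time */
};

/* Record a lookup result together with the sequence it was valid for. */
static void cache_fill(struct thread_cache *c, const struct mm_seq *mm,
                       const void *vma)
{
    c->vma = vma;
    c->mm_sequence = mm->sequence;
}

/* The only validity test the scheme has: do the two sequences match? */
static bool cache_looks_valid(const struct thread_cache *c,
                              const struct mm_seq *mm)
{
    return c->vma != NULL && c->mm_sequence == mm->sequence;
}

/* Apply 'times' invalidating changes; the 8-bit counter wraps modulo 256. */
static void mm_change(struct mm_seq *mm, unsigned int times)
{
    while (times--)
        mm->sequence++;
}
```

If a thread fills its cache and then does no lookup while 256 changes go by, cache_looks_valid() is true again even though the cached pointer may refer to a vma that has long since been freed. A wider counter makes that window rarer but does not close it, which is why resetting the per-task pointers now and then came up.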

Well here is my core patch (against I think 2.6.16 + a set of vma cache
cleanups and abstractions). I didn't think the wrapping aspect was
terribly involved.

Well, I believe this one is too expensive. I was thinking of a lighter one:

I am not deleting mmap_sem, but adding a sequence number to mm_struct that is incremented each time a vma is added or deleted, not each time mmap_sem is taken (for read or write).

That's exactly what mine does (except IIRC it doesn't invalidate when
you add a vma).

Each thread has its own copy of the sequence, taken at the time find_vma() had to do a full lookup.
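A rough userspace sketch of the scheme as described so far, for illustration only. The struct and function names (struct vma, struct mm, struct task, find_vma_slow, find_vma_cached, mm_invalidate_caches) are hypothetical stand-ins rather than kernel code, a sorted list stands in for the mm_rb rbtree, and locking is left out entirely, so the caller is assumed to hold whatever lock protects the vmas.

```c
#include <stddef.h>

/* Hypothetical userspace stand-ins; not the kernel's structures. */
struct vma {
    unsigned long start, end;   /* covers [start, end) */
    struct vma *next;           /* list kept sorted by address */
};

struct mm {
    struct vma *mmap;           /* sorted list standing in for mm_rb */
    unsigned long sequence;     /* bumped whenever a vma is added or deleted */
};

struct task {
    struct mm *mm;
    struct vma *mmap_cache;     /* per-thread cached lookup result */
    unsigned long mm_sequence;  /* mm->sequence when the cache was filled */
    unsigned long slow_lookups; /* counts full lookups, for illustration */
};

/* Full lookup: first vma whose end lies above addr, or NULL. */
static struct vma *find_vma_slow(struct mm *mm, unsigned long addr)
{
    struct vma *v;

    for (v = mm->mmap; v; v = v->next)
        if (addr < v->end)
            return v;
    return NULL;
}

/*
 * Cached lookup: a hit needs a matching sequence and an address inside
 * the cached vma. Otherwise do the full lookup and refresh the cache.
 */
static struct vma *find_vma_cached(struct task *t, unsigned long addr)
{
    struct vma *v = t->mmap_cache;

    if (v && t->mm_sequence == t->mm->sequence &&
        v->start <= addr && addr < v->end)
        return v;

    t->slow_lookups++;
    v = find_vma_slow(t->mm, addr);
    t->mmap_cache = v;
    t->mm_sequence = t->mm->sequence;
    return v;
}

/* One increment invalidates every thread's cache without touching any task. */
static void mm_invalidate_caches(struct mm *mm)
{
    mm->sequence++;
}
```

The point of the design is that invalidation costs a single increment on the mm and never has to walk the threads; each thread notices the mismatch on its own next lookup.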

I believe some optimized paths could call check_vma_cache() without mmap_sem read lock taken, and if it fails, take the mmap_sem lock and do the slow path.

The mmap_sem for read does not only protect the mm_rb rbtree structure, but
the vmas themselves as well as their page tables, so you can't do that.

You could do it if you had a lock-per-vma to synchronise against write
operations, and rcu-freed vmas or some such... but I don't think we should
go down a road like that until we first remove mmap_sem from low hanging
things (like private futexes!) and then see who's complaining.

--
SUSE Labs, Novell Inc.
