Re: [PATCH 0/3] mm,vdso: preallocate new vmas

From: Michel Lespinasse
Date: Tue Oct 22 2013 - 13:04:18 EST


On Tue, Oct 22, 2013 at 9:20 AM, Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> On Tue, Oct 22, 2013 at 4:48 PM, <walken@xxxxxxxxxx> wrote:
>> Generally the problems I see with mmap_sem are related to long latency
>> operations. Specifically, the mmap_sem write side is currently held
>> during the entire munmap operation, which iterates over user pages to
>> free them, and can take hundreds of milliseconds for large VMAs.
>
> So this would be the *perfect* place to just downgrade the semaphore
> from a write to a read.

It's not as simple as that, because we currently rely on the mmap_sem
write side being held during page table teardown in order to exclude
things like follow_page(), which may otherwise access page tables
while we are potentially freeing them.

I do think it's solvable, but it gets complicated fast. Hugh & I have
been talking about it; the approach I'm looking at would involve
unwiring the page tables first (under protection of the mmap_sem write
lock) and then iterating on the unwired page tables to free the data
pages, issue TLB shootdowns and free the actual page tables (we
probably don't even need the mmap_sem read side at that point). But
that's nowhere near a 10-line change anymore...

--
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.