On 22/03/2018 17:05, Matthew Wilcox wrote:
> On Thu, Mar 22, 2018 at 04:54:52PM +0100, Laurent Dufour wrote:
>> On 22/03/2018 16:40, Matthew Wilcox wrote:
>>> On Thu, Mar 22, 2018 at 04:32:00PM +0100, Laurent Dufour wrote:
>>>> Regarding the page fault, why not rely on the PTE locking?
>>>>
>>>> When munmap() unsets the PTE it will have to hold the PTE lock, so
>>>> this will serialize the access.
>>>> If the page fault occurs before the mmap(MAP_FIXED), the page mapped
>>>> will be removed when mmap(MAP_FIXED) does the cleanup. Fair enough.
>>>
>>> The page fault handler will walk the VMA tree to find the correct
>>> VMA and then find that the VMA is marked as deleted. If it assumes
>>> that the VMA has been deleted because of munmap(), then it can raise
>>> SIGSEGV immediately. But if the VMA is marked as deleted because of
>>> mmap(MAP_FIXED), it must wait until the new VMA is in place.
>>
>> I'm wondering if such complexity is required.
>> If the user space process tries to access the page being overwritten
>> through mmap(MAP_FIXED) by another thread, there is no guarantee
>> whether it will manipulate the *old* page or the *new* one.
>
> Right; but it must return one or the other, it can't segfault.

Good point, I missed that...

>> I'd think it is up to the user process to handle that concurrency.
>> What needs to be guaranteed is that once mmap(MAP_FIXED) returns, the
>> old pages are no longer there, which is done through the mmap_sem and
>> PTE locking.
>
> Yes, and allowing the fault handler to return the *old* page risks the
> old page being reinserted into the page tables after the unmapping task
> has done its work.

The PTE locking should prevent that.

> It's *really* rare to page-fault on a VMA which is in the middle of
> being replaced. Why are you trying to optimise it?

I was not trying to optimize it, but to avoid waiting in the page fault
handler. This could become tricky in the case where the VMA is removed
once mmap(MAP_FIXED) is done and before the waiting page fault is woken
up. This means that the removed VMA structure will have to remain until
all the waiters are woken up, which implies a ref_count or similar.

>>> I think I was wrong to describe VMAs as being *deleted*. I think we
>>> instead need the concept of a *locked* VMA that page faults will
>>> block on. Conceptually, it's a per-VMA rwsem, but I'd use a
>>> completion instead of an rwsem since the only reason to write-lock
>>> the VMA is because it is being deleted.
>>
>> Such a lock would only make sense in the case of mmap(MAP_FIXED),
>> since when the VMA is removed there is no need to wait. Is that right?
>
> I can't think of another reason. I suppose we could mark the VMA as
> locked-for-deletion or locked-for-replacement and have the SIGSEGV
> happen early. But I'm not sure that optimising for SIGSEGVs is a
> worthwhile use of our time. Just always have the pagefault sleep for
> a deleted VMA.
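
To make the "locked VMA that page faults block on" idea concrete, here is a minimal userspace sketch in Python. It is a toy model under stated assumptions, not kernel code and not anything proposed in this thread: `ToyVMA`, `ToyMM`, `begin_teardown()` and `finish_teardown()` are hypothetical names, `threading.Event` stands in for a completion, and a plain lock loosely stands in for mmap_sem. The fault path always sleeps on a locked VMA and then retries the lookup, so it ends up with either the replacement mapping or SIGSEGV, without distinguishing munmap() from mmap(MAP_FIXED) up front.

```python
import threading


class ToyVMA:
    """Hypothetical stand-in for a VMA; not the kernel's vm_area_struct."""

    def __init__(self, backing):
        self.backing = backing         # label for what the mapping holds
        self.locked = False            # set while the VMA is being torn down
        self.done = threading.Event()  # plays the role of a completion


class ToyMM:
    """Hypothetical address space: a dict of VMAs guarded by one lock."""

    def __init__(self):
        self.tree_lock = threading.Lock()  # loosely plays the role of mmap_sem
        self.vmas = {}                     # address -> ToyVMA

    def map(self, addr, backing):
        with self.tree_lock:
            self.vmas[addr] = ToyVMA(backing)

    def begin_teardown(self, addr):
        # Mark the VMA locked; faults arriving from now on will sleep.
        with self.tree_lock:
            self.vmas[addr].locked = True

    def finish_teardown(self, addr, new_backing=None):
        # new_backing=None models munmap(); otherwise mmap(MAP_FIXED).
        with self.tree_lock:
            old = self.vmas.pop(addr)
            if new_backing is not None:
                self.vmas[addr] = ToyVMA(new_backing)
        # Wake sleepers only after the tree reflects the final state.
        old.done.set()

    def fault(self, addr):
        while True:
            with self.tree_lock:
                vma = self.vmas.get(addr)
                if vma is None:
                    return "SIGSEGV"
                if not vma.locked:
                    return vma.backing
            # The sleeper keeps its own reference to the old VMA object,
            # which is the lifetime issue a ref_count would solve in C.
            vma.done.wait()


def demo():
    mm = ToyMM()
    mm.map(0x1000, "old")
    mm.begin_teardown(0x1000)
    seen = []
    faulter = threading.Thread(target=lambda: seen.append(mm.fault(0x1000)))
    faulter.start()
    mm.finish_teardown(0x1000, "new")
    faulter.join()
    return seen[0]
```

In this model `demo()` returns "new": the fault either sleeps on the locked VMA and retries, or runs after the replacement is already in place. Python's garbage collector keeps the old `ToyVMA` alive while a sleeper still holds it; a C implementation would need explicit reference counting for the same guarantee.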