On Sat, 16 Sep 2006, Nick Piggin wrote:
... but the problem is still fundamentally COW.
Well, yes, we wouldn't have all these problems if we didn't have
to respect COW. But generally a process can, one way or another,
make sure it won't get into those problems: Yingchao is concerned
with the way the TestSetPageLocked unpredictably upsets correctness.
I'd say it's a more serious error than the general problems with COW.
In other words, one should always be able to return 0 from that
can_share_swap_page and have the system continue to work... right?
Because even if you hadn't done that mprotect trick, you may still
have a problem because the page may *have* to be copied on write
if it is shared over fork.
Most processes won't fork, and exec has freed them from sharing
their parents pages, and their private file mappings aren't being
used as buffers. Maybe Yingchao will later have to worry about
those cases, but for now it seems not.
So if we filled in the missing mm/ implementation of VM_DONTCOPY
(and call it MAP_DONTCOPY rather than the confusing MAP_DONTFORK)
such that it withstands such an mprotect sequence, we can then ask
that all userspace drivers do their get_user_pages memory on these
types of vmas.
(madvise MADV_DONTFORK)
For the longest time I couldn't understand you there at all, perhaps
distracted by your parenthetical line: at last I think you're proposing
we tweak mprotect to behave differently on a VM_DONTCOPY area.
But differently in what way? Allow it to ignore Copy-On-Write?