Re: [RFC] Page table sharing

From: Hugh Dickins (hugh@veritas.com)
Date: Sun Feb 17 2002 - 17:16:48 EST


On Sun, 17 Feb 2002, Daniel Phillips wrote:
>
> Note that we have to hold the page_table_share_lock until we're finished
> copying in the ptes, otherwise the source could go away. This can turn
> into a lock on the page table itself eventually, so whatever contention
> there might be will be eliminated.
>
> Fixing up copy_page_range to bring the pmd populate inside the
> mm->page_table_lock is trivial, I won't go into it here. With that plus
> changes as above, I think it's tight. Though I would not bet my life on
> it ;-)

Sorry, I didn't really try to follow your preceding discussion of
zap_page_range. (I suspect you need to step back and think again if it
gets that messy; but that may be unfair, I haven't thought it through).

You need your "page_table_share_lock" (better, per-page-table spinlock)
much more than you seem to realize. If mm1 and mm2 share a page table,
mm1->page_table_lock and mm2->page_table_lock give no protection against
each other. Consider copy_page_range from mm1 or __pte_alloc in mm1
while try_to_swap_out is acting on shared page table in mm2. In fact,
I think even the read faults are vulnerable to races (mm1 and mm2
bringing page in at the same time so double-counting it), since your
__pte_alloc doesn't regard a read fault as reason to break the share.

I'm also surprised that your copy_page_range appears to be setting
write protect on each pte, including expensive pte_page, VALID_PAGE
stuff on each. You avoid actually copying pte and incrementing counts,
but I'd expect you to want to avoid the whole scan: invalidating entry
for the page table itself, to force fault if needed.

Hugh

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Sat Feb 23 2002 - 21:00:13 EST