On Tue, 19 Feb 2002, Daniel Phillips wrote:
> >
> > Which implies that the swapper needs to look up all mm's some way anyway,
>
> Ick. With rmap this is straightforward, but without, what?
It is not at ALL straightforward with rmap either.
Remember: one of the big original _points_ of the pmd sharing was to avoid
having to do the rmap overhead for shared page tables. The fact that it
works without rmap too was just a nice bonus, and makes apples-to-apples
comparisons possible.
So if you do the rmap overhead even when sharing, you're toast. No more
shared pmd's.
> Maybe page tables should be unshared on swapin/out after all, only on arches
> that need special tlb treatment, or until we have rmap.
There is no "or until we have rmap". It doesn't help. All the same issues
hold - if you have to invalidate multiple mm's, you have to find them all.
That's the same whether you have rmap or not, and is a fundamental issue
with sharing pmd's.
Dang, I should have noticed before this.
Note that "swapin" is certainly not the problem - we don't need to swap
the thing into all mm's at the same time, so if a unshare happens just
before/after the swapin and the unshared process doesn't get the thing,
we're still perfectly fine.
In fact, swapin is not even a spacial case. It's just the same as any
other page fault - we can continue to share page tables over "read-only"
page faults, and even that is _purely_ an optimization (yeah, it needs
some trivial "cmpxchg()" magic on the pmd to work, but it has no TLB
invalidation issues or anything really complex like that).
The only problem is swapout. And "swapout()" is always a problem, in fact.
It's always been special, because it is quite fundamentally the only VM
operation that ever is "nonlocal". We've had tons of races with swapout
over time, it's always been the nastiest VM operation by _far_ when it
comes to page table coherency.
We can, of course, introduce a "pmd-rmap" thing, with a pointer to a
circular list of all mm's using that pmd inside the "struct page *" of the
pmd. Right now the rmap patches just make the pointer point directly to
the one exclusive mm that holds the pmd, right?
(This could be a good "gradual introduction to some of the rmap data
structures" thing too).
Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
This archive was generated by hypermail 2b29 : Sat Feb 23 2002 - 21:00:17 EST