Re: [PATCH 2/4] Add replace_page(), change the mapping of pte from one page into another

From: Lee Schermerhorn
Date: Wed Nov 12 2008 - 17:09:24 EST


On Wed, 2008-11-12 at 14:27 -0600, Christoph Lameter wrote:
> On Wed, 12 Nov 2008, Andrea Arcangeli wrote:
>
> > On Tue, Nov 11, 2008 at 09:10:45PM -0600, Christoph Lameter wrote:
> > > get_user_pages() cannot get to it since the pagetables have already been
> > > modified. If get_user_pages runs then the fault handling will occur
> > > which will block the thread until migration is complete.
> >
> > migrate.c does nothing for ptes pointing to swap entries and
> > do_swap_page won't wait for them either. Assume follow_page in
>
> If an anonymous page is a swap page then it has a mapping.
> migrate_page_move_mapping() will lock the radix tree and ensure that no
> additional reference (like done by do_swap_page) is established during
> migration.

So, it's Nick's reference freezing you asked about in response to my
mail that prevents do_swap_page() from getting another reference on the
page in the swap cache just after migrate_page_move_mapping() checks the
ref count and replaces the slot with new swap pte. The radix tree lock
just prevents other threads from modifying the slot, right? [Hmmm, looks
like we need to update the reference to "write lock" in the comments on
the _deref_slot() and _replace_slot() definitions in radix-tree.h.]

Therefore, do_swap_page() will either get the old page and raise the ref
count before the migration check, or it will [possibly loop in
find_get_page() and then] get the new page.

Migration will bail out, for this pass anyway, in the former case. In
the second case, do_swap_page() will wait on the new page lock until
migration completes, deferring any direct IO.
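
To make the two orderings concrete, here is a rough userspace sketch of
how I understand the freeze to work. It is a single-threaded model using
C11 atomics, not kernel code: struct page, slot, get_ref_unless_zero(),
freeze_refs(), unfreeze_refs() and lookup_once() are all stand-ins made
up for illustration, only loosely modeled on get_page_unless_zero(),
page_freeze_refs() and find_get_page(). The starting counts (2 for the
old page, 1 for the new one) are assumptions as well.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page: only the reference count. */
struct page { atomic_int refcount; };

/* Hypothetical stand-in for one radix tree slot of the swap cache. */
static struct page *_Atomic slot;

/* Take a reference unless the count is zero, i.e. unless frozen. */
static bool get_ref_unless_zero(struct page *p)
{
    int old = atomic_load(&p->refcount);

    while (old != 0) {
        /* On failure 'old' is reloaded and the zero check repeats. */
        if (atomic_compare_exchange_weak(&p->refcount, &old, old + 1))
            return true;
    }
    return false;
}

/* Freeze: succeeds only if the count is exactly 'expected'; sets it to 0. */
static bool freeze_refs(struct page *p, int expected)
{
    return atomic_compare_exchange_strong(&p->refcount, &expected, 0);
}

static void unfreeze_refs(struct page *p, int count)
{
    atomic_store(&p->refcount, count);
}

/* One attempt of a lockless lookup; NULL means "retry". */
static struct page *lookup_once(void)
{
    struct page *p = atomic_load(&slot);

    if (!get_ref_unless_zero(p))
        return NULL;                        /* frozen by migration */
    if (p != atomic_load(&slot)) {          /* slot changed under us */
        atomic_fetch_sub(&p->refcount, 1);
        return NULL;
    }
    return p;
}

/* Former case: the lookup wins the race, so migration must bail out. */
bool lookup_first_migration_bails(void)
{
    struct page oldp, newp;

    atomic_init(&oldp.refcount, 2);     /* assumed: cache ref + migration ref */
    atomic_init(&newp.refcount, 1);
    atomic_store(&slot, &oldp);

    struct page *got = lookup_once();           /* count goes 2 -> 3 */
    bool frozen = freeze_refs(&oldp, 2);        /* fails: count is 3 */

    return got == &oldp && !frozen;
}

/* Second case: migration freezes first, the lookup retries, sees new page. */
bool migration_first_lookup_gets_new(void)
{
    struct page oldp, newp;

    atomic_init(&oldp.refcount, 2);
    atomic_init(&newp.refcount, 1);
    atomic_store(&slot, &oldp);

    if (!freeze_refs(&oldp, 2))                 /* count goes 2 -> 0 */
        return false;
    struct page *during = lookup_once();        /* NULL: would loop */

    atomic_fetch_add(&newp.refcount, 1);        /* cache ref moves to new page */
    atomic_store(&slot, &newp);                 /* replace the slot */
    unfreeze_refs(&oldp, 1);                    /* old page loses the cache ref */

    struct page *after = lookup_once();         /* retry finds the new page */

    return during == NULL && after == &newp;
}
```

Both scenario functions return true in this model. The real code also
involves the page lock and the tree lock, which the sketch leaves out.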

Or am I still missing something?

>
> > However it's not exactly the same bug as the one in fork, I was
> > talking about before, it's also not o_direct specific. Still
>
> So far I have seen wild ideas not bugs.

Maybe not so wild, given the complexity of these interactions...

Later,
Lee

