Re: [PATCH 0/9] Hugepage migration (v2)

From: Naoya Horiguchi
Date: Tue Aug 17 2010 - 04:20:23 EST

Next message: Tejun Heo: "Re: [PATCH 2/5] virtio_blk: implement REQ_FLUSH/FUA support"
Previous message: Lin Ming: "selinux build error - 'FILE__AUDIT_ACCESS' undeclared"
In reply to: Naoya Horiguchi: "Re: [PATCH 0/9] Hugepage migration (v2)"
Next in thread: Andi Kleen: "Re: [PATCH 0/9] Hugepage migration (v2)"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Tue, Aug 17, 2010 at 11:37:19AM +0900, Naoya Horiguchi wrote:
> On Mon, Aug 16, 2010 at 07:19:58AM -0500, Christoph Lameter wrote:
> > On Mon, 16 Aug 2010, Naoya Horiguchi wrote:
> >
> > > In my understanding, in the current code "other processors increasing
> > > refcount during migration" can happen in both non-hugepage direct I/O and
> > > hugepage direct I/O in a similar way (i.e. get_user_pages_fast() from
> > > dio_refill_pages()). So I think there is no problem specific to hugepages.
> > > Or am I missing your point?
> >
> > With a single page there is the check of the refcount during migration
> > after all the references have been removed (at that point the page is no
> > longer mapped by any process and direct I/O can no longer be
> > initiated without a page fault).
>
> The same checking mechanism works for hugepages.

So, my previous comment below was not correct:

>>> This patch only handles migration under direct I/O.
>>> For the opposite (direct I/O under migration) it's not true.
>>> I wrote additional patches (later I'll reply to this email)
>>> for solving locking problem. Could you review them?

The hugepage migration patchset should work fine without the
additional page locking patch.
Please ignore the additional page locking patchset
and review only the hugepage migration patchset.
Sorry for the confusion.

Below I explain why the page lock in direct I/O is not needed to avoid
a race with migration. This is true for both hugepages and non-huge pages.

The race between page migration and direct I/O is in essence the one between
try_to_unmap() in unmap_and_move() and get_user_pages_fast() in dio_get_page().

When try_to_unmap() is called before get_user_pages_fast(),
all ptes pointing to the page to be migrated are replaced with migration
swap entries, so the direct I/O code takes a page fault.
In the page fault, the kernel finds the migration swap entry and waits in
migration_entry_wait() for the page lock (which the migration code took
before try_to_unmap()) to be released, so direct I/O blocks until
migration completes.

When get_user_pages_fast() is called before try_to_unmap(),
the direct I/O code increments the refcount on the target page.
Because this refcount is not associated with any mapping,
the migration code will find a remaining reference after try_to_unmap()
has removed all mappings. The refcount check then makes migration fail,
so direct I/O continues safely.
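
To make the two orderings above concrete, here is a small userspace toy
model in C. It is only a sketch and not kernel code: struct toy_page and
all toy_* functions are hypothetical names made up for illustration, and
I assume that the only reference expected to remain after unmapping is
the single one held by the migration code.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: every name here is hypothetical, not a kernel API. */
struct toy_page {
	int refcount;          /* all references, one per mapping included */
	int mapcount;          /* ptes currently mapping the page */
	bool locked;           /* page lock, taken by migration */
	bool migration_entry;  /* ptes replaced by migration swap entries */
};

enum dio_result { DIO_PINNED, DIO_MUST_WAIT };
enum mig_result { MIG_OK, MIG_FAILED_BUSY };

/* Stand-in for get_user_pages_fast() in the direct I/O path. */
static enum dio_result toy_gup_fast(struct toy_page *p)
{
	if (p->migration_entry)
		return DIO_MUST_WAIT;  /* fault; would sleep until unlock */
	p->refcount++;                 /* pin that is not tied to a mapping */
	return DIO_PINNED;
}

/* Stand-in for lock_page() followed by try_to_unmap(). */
static void toy_lock_and_unmap(struct toy_page *p, int *saved_maps)
{
	p->locked = true;
	*saved_maps = p->mapcount;
	p->refcount -= p->mapcount;    /* each mapping drops its reference */
	p->mapcount = 0;
	p->migration_entry = true;
}

/* Stand-in for the refcount check and the cleanup that follows it. */
static enum mig_result toy_finish(struct toy_page *p, int saved_maps)
{
	assert(p->locked);
	/* Assumption: only migration's own reference (1) may remain. */
	enum mig_result r = (p->refcount == 1) ? MIG_OK : MIG_FAILED_BUSY;

	/* Either way the migration entries go away and the lock is dropped. */
	p->mapcount = saved_maps;
	p->refcount += saved_maps;
	p->migration_entry = false;
	p->locked = false;
	return r;
}

/* Case 1: try_to_unmap() runs before get_user_pages_fast(). */
bool toy_unmap_first(void)
{
	struct toy_page p = { .refcount = 2, .mapcount = 1 };
	int saved;

	toy_lock_and_unmap(&p, &saved);
	bool blocked = (toy_gup_fast(&p) == DIO_MUST_WAIT);
	bool migrated = (toy_finish(&p, saved) == MIG_OK);
	bool retried = (toy_gup_fast(&p) == DIO_PINNED);
	return blocked && migrated && retried;
}

/* Case 2: get_user_pages_fast() runs before try_to_unmap(). */
bool toy_gup_first(void)
{
	struct toy_page p = { .refcount = 2, .mapcount = 1 };
	int saved;

	bool pinned = (toy_gup_fast(&p) == DIO_PINNED);
	toy_lock_and_unmap(&p, &saved);
	bool failed = (toy_finish(&p, saved) == MIG_FAILED_BUSY);
	return pinned && failed && p.refcount == 3;
}
```

Both toy_unmap_first() and toy_gup_first() return true in this model:
in the first case direct I/O has to wait and only pins after migration
is done, and in the second case the extra pin makes the refcount check
fail, so the page stays where it is. The model uses a single struct for
the old and new page and ignores real concurrency, so it only shows the
ordering argument, nothing more.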

Thanks,
Naoya Horiguchi

