Re: [PATCH 2/5] Swapless V2: Add migration swap entries

From: Andrew Morton
Date: Thu Apr 13 2006 - 21:02:45 EST


Christoph Lameter <clameter@xxxxxxx> wrote:
>
> On Thu, 13 Apr 2006, Andrew Morton wrote:
>
> > Christoph Lameter <clameter@xxxxxxx> wrote:
> > >
> > > On Thu, 13 Apr 2006, Andrew Morton wrote:
> > >
> > > > Christoph Lameter <clameter@xxxxxxx> wrote:
> > > > >
> > > > > +
> > > > > + if (unlikely(is_migration_entry(entry))) {
> > > >
> > > > Perhaps put the unlikely() in is_migration_entry()?
> > > >
> > > > > + yield();
> > > >
> > > > Please, no yielding.
> > > >
> > > > _especially_ no unchangelogged, uncommented yielding.
> > >
> > > Page migration is ongoing so it's best to do something else first.
> >
> > That doesn't help a lot. What is "something else"? What are the dynamics
> > in there, and why do you feel that some sort of delay is needed?
>
> Page migration is ongoing for the page that was faulted. This means
> the migration thread has torn down the ptes and replaced them with
> migration entries in order to prevent access to this page. The migration
> thread is continuing the process of tearing down ptes, copying the page
> and then rebuilding the ptes. When the ptes are back then the fault
> handler will no longer be invoked or it will fix up some of the bits in
> the ptes. This takes a short time; the more ptes point to a page, the
> longer it will take to replace them.

So we falsely return VM_FAULT_MINOR and let userspace retake the pagefault,
thus implementing a form of polling, yes? If so, there is no "something
else" which this process can do.
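
To make the polling concrete, here is a rough userspace simulation of that
retry loop. It is not kernel code: every sim_* name is a hypothetical
stand-in, and the migration thread is modelled as one unit of work done
between faults. It only illustrates that each fault taken while the migration
entry is in place does no useful work.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not kernel definitions. */
enum sim_fault_result { SIM_FAULT_MINOR, SIM_FAULT_DONE };

struct sim_pte {
    bool migration_entry;   /* true while the page is being migrated */
    int  steps_left;        /* simulated migration work remaining */
};

/* One unit of simulated migration work; restores the pte when finished. */
static void sim_migration_step(struct sim_pte *pte)
{
    if (pte->migration_entry && --pte->steps_left <= 0)
        pte->migration_entry = false;
}

/* Simulated fault handler: claims success without fixing anything. */
static enum sim_fault_result sim_handle_fault(struct sim_pte *pte)
{
    if (pte->migration_entry)
        return SIM_FAULT_MINOR;   /* the access will simply fault again */
    return SIM_FAULT_DONE;
}

/* Returns how many faults were taken without making progress. */
static int sim_access_page(struct sim_pte *pte)
{
    int wasted = 0;

    assert(pte != NULL);
    while (sim_handle_fault(pte) == SIM_FAULT_MINOR) {
        wasted++;
        sim_migration_step(pte);  /* migration advances in the meantime */
    }
    return wasted;
}
```

With steps_left set to 3, sim_access_page() takes three wasted faults before
the access goes through.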

Pages are locked during migration. The faulting process will sleep in
lock_page() until migration is complete. Except we've gone and diddled
with the swap pte so do_swap_page() can no longer locate the page which
needs to be locked.

Doing a busy-wait seems a bit lame. Perhaps it would be better to go to
sleep on some global queue, and poke that queue each time a page migration
completes?
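
A userspace sketch of the global-queue idea, assuming a pthread condition
variable as a stand-in for a kernel wait queue. All names here (mig_lock,
mig_done, sim_page, the sim_* functions) are hypothetical and only show the
shape of the approach: waiters sleep on one shared queue and recheck their own
page whenever any migration completes.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace analogue of a single global wait queue. */
static pthread_mutex_t mig_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  mig_done = PTHREAD_COND_INITIALIZER;

struct sim_page {
    bool under_migration;   /* protected by mig_lock */
};

/* Faulting side: sleep until this page is no longer under migration. */
static void sim_wait_on_migration(struct sim_page *page)
{
    pthread_mutex_lock(&mig_lock);
    /* The queue is shared, so a wakeup may be for some other page. */
    while (page->under_migration)
        pthread_cond_wait(&mig_done, &mig_lock);
    pthread_mutex_unlock(&mig_lock);
}

/* Migration side: mark the page done and poke the global queue. */
static void sim_finish_migration(struct sim_page *page)
{
    pthread_mutex_lock(&mig_lock);
    page->under_migration = false;
    pthread_cond_broadcast(&mig_done);
    pthread_mutex_unlock(&mig_lock);
}

/* Thread entry point that completes the migration of one page. */
static void *sim_migration_thread(void *arg)
{
    sim_finish_migration(arg);
    return NULL;
}
```

The cost of a single queue is that every completed migration wakes every
waiter, which is why the waiter loops on its own page's state.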
