Re: [PATCH 0/3] Fix migration races in rmap_walk() V2

From: Mel Gorman
Date: Wed Apr 28 2010 - 11:24:18 EST


On Wed, Apr 28, 2010 at 05:16:14PM +0200, Andrea Arcangeli wrote:
> On Wed, Apr 28, 2010 at 03:57:38PM +0100, Mel Gorman wrote:
> > On Wed, Apr 28, 2010 at 03:23:56PM +0100, Mel Gorman wrote:
> > > On Wed, Apr 28, 2010 at 02:20:56AM +0200, Andrea Arcangeli wrote:
> > > > On Wed, Apr 28, 2010 at 09:13:45AM +0900, KAMEZAWA Hiroyuki wrote:
> > > > > Doing some check in move_ptes() after vma_adjust() is not safe.
> > > > > IOW, when vma's information and information in page-table is incosistent...objrmap
> > > > > is broken and migartion will cause panic.
> > > > >
> > > > > Then...I think there are 2 ways.
> > > > > 1. use seqcounter in "mm_struct" as previous patch and lock it at mremap.
> > > > > or
> > > > > 2. get_user_pages_fast() when do remap.
> > > >
> > > > 3 take the anon_vma->lock
> > > >
> > >
> > > <SNIP>
> > >
> > > Here is a different version of the same basic idea to skip temporary VMAs
> > > during migration. Maybe go with this?
> > >
> > > +static bool is_vma_temporary_stack(struct vm_area_struct *vma)
> > > +{
> > > + if (vma->vm_flags != VM_STACK_FLAGS)
> > > + return false;
> > > +
> > > + /*
> > > + * Only during exec will the total VM consumed by a process
> > > + * be exacly the same as the stack
> > > + */
> > > + if (vma->vm_mm->stack_vm == 1 && vma->vm_mm->total_vm == 1)
> > > + return true;
> > > +
> > > + return false;
> > > +}
> > > +
> >
> > The assumptions on the vm flags is of course totally wrong. VM_EXEC might
> > be applied as well as default flags from the mm. The following is the same
> > basic idea, skip VMAs belonging to processes in exec rather than trying
> > to hold anon_vma->lock across move_page_tables(). Not tested yet.
>
> This is better than the other, that made it look like people could set
> broken rmap at arbitrary times, at least this shows it's ok only
> during execve before anything run, but if we can't take the anon-vma
> lock really better than having to make these special checks inside
> every rmap_walk that has to be accurate, we should just delay the
> linkage of the stack vma into its anon-vma, until after the move_pages.
>

Is it possible to delay the linkage like that? As arguments get copied into
the temporary stack before it gets moved, I'd have expected the normal fault
path to prepare and attach the anon_vma. We could special case it but
that isn't very palatable either.

--
Mel Gorman
Part-time Phd Student Linux Technology Center
University of Limerick IBM Dublin Software Lab
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/