Re: [PATCH 06/12] x86/mm: Enable and use the arch_pgd_init_late() method

From: Ingo Molnar
Date: Sat Jun 13 2015 - 02:53:09 EST



* Ingo Molnar <mingo@xxxxxxxxxx> wrote:

> * Oleg Nesterov <oleg@xxxxxxxxxx> wrote:
>
> > On 06/11, Ingo Molnar wrote:
> > >
> > > +void arch_pgd_init_late(struct mm_struct *mm, pgd_t *pgd)
> > > +{
> > > + /*
> > > + * This is called after a new MM has been made visible
> > > + * in fork() or exec().
> > > + *
> > > + * This barrier makes sure the MM is visible to new RCU
> > > + * walkers before we initialize it, so that we don't miss
> > > + * updates:
> > > + */
> > > + smp_wmb();
> >
> > I can't understand the comment and the barrier...
> >
> > Afaics, we need to ensure that:
> >
> > > + if (pgd_val(*pgd_src))
> > > + WRITE_ONCE(*pgd_dst, *pgd_src);
> >
> > either we notice the recent update of this PGD, or (say) the subsequent
> > sync_global_pgds() can miss the child.
> >
> > How the write barrier can help?
>
> So the real thing this pairs with is the earlier:
>
> tsk->mm = mm;
>
> plus the linking of the new task in the task list.
>
> _that_ write must become visible to others before we do the (conditional) copy
> ourselves.
>
> Granted, it happens quite a bit earlier, and the task linking's use of locking
> is a natural barrier - but since this is lockless I didn't want to leave a
> silent assumption in.
>
> Perhaps remove the barrier and just leave a comment in that describes the
> assumption on task-linking being a full barrier?

Ah, there's another detail I forgot. Relying on task linking as a barrier might
handle the fork() case, but in exec() we have:

	tsk->mm = mm;
	arch_pgd_init_late(mm);

and since the task is already linked by then, there is no task-linking step to
act as a barrier: here we need the explicit one.

So how about I improve the comment to:

	/*
	 * This function is called after a new MM has been made visible
	 * in fork() or exec() via:
	 *
	 *   tsk->mm = mm;
	 *
	 * This barrier makes sure the MM is visible to new RCU
	 * walkers before we initialize the pagetables below, so that
	 * we don't miss updates:
	 */
	smp_wmb();

and leave the barrier there?
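
To make the intended ordering concrete, here is a rough user-space sketch of
the exec() sequence above. It is not kernel code: the *_sketch names,
reference_pgd and NR_PGD_ENTRIES are all made up for illustration, and
atomic_thread_fence(memory_order_release) is only an assumed stand-in for
smp_wmb(). It shows where the barrier sits relative to the publish and the
conditional copy; whether a write barrier is sufficient there is exactly the
question raised above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define NR_PGD_ENTRIES 4		/* hypothetical, tiny on purpose */

typedef uint64_t pgd_entry_t;		/* stand-in for pgd_t */

struct mm_sketch {
	_Atomic pgd_entry_t pgd[NR_PGD_ENTRIES];
};

struct task_sketch {
	struct mm_sketch *_Atomic mm;
};

/* Stand-in for the reference PGD that the entries are copied from. */
_Atomic pgd_entry_t reference_pgd[NR_PGD_ENTRIES];

void pgd_init_late_sketch(struct mm_sketch *mm)
{
	assert(mm != NULL);

	/*
	 * Assumed analogue of smp_wmb(): the earlier store to tsk->mm
	 * is ordered before the PGD stores below.
	 */
	atomic_thread_fence(memory_order_release);

	for (size_t i = 0; i < NR_PGD_ENTRIES; i++) {
		pgd_entry_t src = atomic_load_explicit(&reference_pgd[i],
						       memory_order_relaxed);

		/* Conditional copy: only populated entries are written. */
		if (src)
			atomic_store_explicit(&mm->pgd[i], src,
					      memory_order_relaxed);
	}
}

void exec_publish_sketch(struct task_sketch *tsk, struct mm_sketch *mm)
{
	/* tsk->mm = mm; the MM becomes visible to walkers here. */
	atomic_store_explicit(&tsk->mm, mm, memory_order_relaxed);

	/* Then the late init, which starts with the barrier. */
	pgd_init_late_sketch(mm);
}
```

A caller would populate reference_pgd, then call exec_publish_sketch() with
a zeroed mm_sketch; only the non-zero entries end up copied.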

Thanks,

Ingo