Re: [RFC PATCH 1/2] Fix: sched/membarrier: p->mm->membarrier_state racy load

From: Peter Zijlstra
Date: Wed Sep 04 2019 - 07:39:39 EST


On Wed, Sep 04, 2019 at 12:53:49PM +0200, Oleg Nesterov wrote:
> On 09/03, Mathieu Desnoyers wrote:
> >
> > @@ -1130,6 +1130,10 @@ struct task_struct {
> > unsigned long numa_pages_migrated;
> > #endif /* CONFIG_NUMA_BALANCING */
> >
> > +#ifdef CONFIG_MEMBARRIER
> > + atomic_t membarrier_state;
> > +#endif
>
> ...
>
> > +static inline void membarrier_prepare_task_switch(struct task_struct *t)
> > +{
> > + if (!t->mm)
> > + return;
> > + atomic_set(&t->membarrier_state,
> > + atomic_read(&t->mm->membarrier_state));
> > +}
>
> Why not
>
> rq->membarrier_state = next->mm ? t->mm->membarrier_state : 0;
>
> and
>
> if (cpu_rq(cpu)->membarrier_state & MEMBARRIER_STATE_GLOBAL_EXPEDITED) {
> ...
> }
>
> in membarrier_global_expedited() ? (I removed atomic_ to simplify)
>
> IOW, why this new member has to live in task_struct, not in rq?

It could be like the above; but then we're still reading
mm->membarrier_state in the scheduler path.

The patch I just send avoids event that, at the cost of doing a
do_each_thread/while_each_thread iteration in the uncommon case where we
register a process after it grows threads.