Re: [RFC PATCH 1/2] Fix: sched/membarrier: p->mm->membarrier_state racy load

From: Peter Zijlstra
Date: Wed Sep 04 2019 - 08:43:54 EST


On Wed, Sep 04, 2019 at 02:03:37PM +0200, Oleg Nesterov wrote:
> On 09/04, Peter Zijlstra wrote:
> >
> > + struct task_struct *g, *t;
> > +
> > + read_lock(&tasklist_lock);
> > + do_each_thread(g, t) {
>
> for_each_process_thread() looks better

Argh, I always get confused. Why do we have multiple versions of this
again?

> > +	if (t->mm == mm) {
> > +		atomic_or(MEMBARRIER_STATE_GLOBAL_EXPEDITED,
> > +			  &t->membarrier_state);
> > +	}
>
> then you also need to change dup_task_struct(), it should clear
> ->membarrier_state unless CLONE_VM.

Or, as you suggest below.

> And probably unuse_mm() should clear current->membarrier_state too.

How about we hard exclude PF_KTHREAD and ignore {,un}use_mm() entirely?

> Hmm. And it can race with copy_process() anyway, tasklist_lock can't
> really help. So copy_process() needs to do
>
> write_lock_irq(&tasklist_lock);
> ...
>
> 	if (clone_flags & CLONE_VM)
> 		p->membarrier_state = current->membarrier_state;
> 	else
> 		p->membarrier_state = 0;

Right you are.
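
For reference, a minimal userspace sketch of the two rules discussed above:
set the bit on every non-kthread task that shares the mm, and have a new task
inherit the state only when it shares the parent's mm (CLONE_VM). This is a
model, not kernel code. struct mm_model, struct task_model,
mark_global_expedited() and inherit_membarrier_state() are hypothetical names,
the flag values are illustrative, membarrier_state is a plain integer rather
than an atomic_t, and all locking (tasklist_lock) is left out.

```c
#include <stddef.h>

/* Illustrative values; this sketch does not depend on the kernel's own. */
#define CLONE_VM                          0x00000100u
#define PF_KTHREAD                        0x00200000u
#define MEMBARRIER_STATE_GLOBAL_EXPEDITED (1u << 3)

/* Hypothetical stand-ins for mm_struct and task_struct. */
struct mm_model {
	int id;
};

struct task_model {
	struct mm_model *mm;
	unsigned int flags;
	unsigned int membarrier_state;
};

/*
 * Model of the registration loop: mark every task using @mm, skipping
 * kernel threads so that {,un}use_mm() does not need to be handled.
 * The real loop would run under tasklist_lock and use atomic_or().
 */
static void mark_global_expedited(struct task_model *tasks, size_t n,
				  const struct mm_model *mm)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (tasks[i].flags & PF_KTHREAD)
			continue;
		if (tasks[i].mm == mm)
			tasks[i].membarrier_state |=
				MEMBARRIER_STATE_GLOBAL_EXPEDITED;
	}
}

/*
 * Model of the copy_process() step: a child that shares the parent's mm
 * inherits the state, a child with its own mm starts from zero.
 */
static void inherit_membarrier_state(struct task_model *child,
				     const struct task_model *parent,
				     unsigned long clone_flags)
{
	if (clone_flags & CLONE_VM)
		child->membarrier_state = parent->membarrier_state;
	else
		child->membarrier_state = 0;
}
```

In the real code the inheritance would have to happen under
write_lock_irq(&tasklist_lock), as Oleg notes, so that it cannot race with
the registration loop.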