Re: Local execution of ipi_sync_rq_state() on sync_runqueues_membarrier_state()

From: Mathieu Desnoyers
Date: Wed Feb 17 2021 - 09:54:59 EST


----- On Feb 16, 2021, at 4:35 PM, Nadav Amit nadav.amit@xxxxxxxxx wrote:

> Hello Mathieu,
>
> While trying to find some unrelated bug, something in
> sync_runqueues_membarrier_state() caught my eye:
>
>
> static int sync_runqueues_membarrier_state(struct mm_struct *mm)
> {
> 	if (atomic_read(&mm->mm_users) == 1 || num_online_cpus() == 1) {
> 		this_cpu_write(runqueues.membarrier_state, membarrier_state);
>
> 		/*
> 		 * For single mm user, we can simply issue a memory barrier
> 		 * after setting MEMBARRIER_STATE_GLOBAL_EXPEDITED in the
> 		 * mm and in the current runqueue to guarantee that no memory
> 		 * access following registration is reordered before
> 		 * registration.
> 		 */
> 		smp_mb();
> 		return 0;
> 	}
>
> [ snip ]
>
> smp_call_function_many(tmpmask, ipi_sync_rq_state, mm, 1);
>
>
> And ipi_sync_rq_state() does:
>
> this_cpu_write(runqueues.membarrier_state,
> atomic_read(&mm->membarrier_state));
>
>
> So my question: are you aware smp_call_function_many() would not run
> ipi_sync_rq_state() on the local CPU?

Generally, yes, I am aware of it, but it appears that when I wrote that
code, I missed that important fact. See

commit 227a4aadc75b ("sched/membarrier: Fix p->mm->membarrier_state racy load")

> Is that the intention of the code?

Clearly not! If we look at sync_runqueues_membarrier_state(), there is even a
special case for mm_users == 1 || num_online_cpus() == 1 where it writes the
membarrier state into the current CPU's runqueue. I'll prepare a fix, thanks a
bunch for spotting this.
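
To illustrate the problem, here is a rough user-space model, not kernel code.
Everything in it (NR_CPUS, STATE_REGISTERED, the model_* helpers and
stale_cpus_after_sync()) is a hypothetical stand-in. It only models the
documented difference between smp_call_function_many(), which skips the
calling CPU, and on_each_cpu_mask(), which also runs the function locally
when the calling CPU is in the mask. It is a sketch of the semantics under
those assumptions, not necessarily the shape the fix will take.

```c
#include <stdbool.h>

#define NR_CPUS          4	/* hypothetical CPU count for this model */
#define STATE_REGISTERED 1	/* hypothetical stand-in for the mm's membarrier state */

/* Models the per-CPU runqueues.membarrier_state field. */
static int rq_state[NR_CPUS];

/* Models ipi_sync_rq_state(): copy the mm state into one CPU's runqueue. */
static void model_ipi_sync_rq_state(int cpu, int state)
{
	rq_state[cpu] = state;
}

/* Models smp_call_function_many(): every CPU in the mask EXCEPT the caller. */
static void model_call_function_many(const bool *mask, int this_cpu, int state)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (mask[cpu] && cpu != this_cpu)
			model_ipi_sync_rq_state(cpu, state);
}

/* Models on_each_cpu_mask(): every CPU in the mask, the caller included. */
static void model_on_each_cpu_mask(const bool *mask, int state)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (mask[cpu])
			model_ipi_sync_rq_state(cpu, state);
}

/*
 * Run one sync from CPU 0 and count the CPUs in the mask whose
 * runqueue state was left stale afterwards.
 */
static int stale_cpus_after_sync(bool include_self)
{
	const bool mask[NR_CPUS] = { true, true, false, true };
	const int this_cpu = 0;	/* the caller is itself in the mask */
	int stale = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		rq_state[cpu] = 0;

	if (include_self)
		model_on_each_cpu_mask(mask, STATE_REGISTERED);
	else
		model_call_function_many(mask, this_cpu, STATE_REGISTERED);

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (mask[cpu] && rq_state[cpu] != STATE_REGISTERED)
			stale++;

	return stale;
}
```

In this model, stale_cpus_after_sync(false) leaves exactly one stale runqueue
(the caller's own), while stale_cpus_after_sync(true) leaves none.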

Mathieu

>
> Thanks,
> Nadav

--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com