Re: [tip: sched/urgent] sched/membarrier: Fix p->mm->membarrier_state racy load

From: Ingo Molnar
Date: Tue Oct 01 2019 - 04:44:12 EST



* tip-bot2 for Mathieu Desnoyers <tip-bot2@xxxxxxxxxxxxx> wrote:

> The following commit has been merged into the sched/urgent branch of tip:
>
> Commit-ID: 227a4aadc75ba22fcb6c4e1c078817b8cbaae4ce
> Gitweb: https://git.kernel.org/tip/227a4aadc75ba22fcb6c4e1c078817b8cbaae4ce
> Author: Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx>
> AuthorDate: Thu, 19 Sep 2019 13:37:02 -04:00
> Committer: Ingo Molnar <mingo@xxxxxxxxxx>
> CommitterDate: Wed, 25 Sep 2019 17:42:30 +02:00
>
> sched/membarrier: Fix p->mm->membarrier_state racy load

> + rcu_read_unlock();
> if (!fallback) {
> preempt_disable();
> smp_call_function_many(tmpmask, ipi_mb, NULL, 1);
> @@ -136,6 +178,7 @@ static int membarrier_private_expedited(int flags)
> }
>
> cpus_read_lock();
> + rcu_read_lock();
> for_each_online_cpu(cpu) {
> struct task_struct *p;
>
> @@ -157,8 +200,8 @@ static int membarrier_private_expedited(int flags)
> else
> smp_call_function_single(cpu, ipi_mb, NULL, 1);
> }
> - rcu_read_unlock();
> }
> + rcu_read_unlock();

I noticed this too late, but the locking in this part is now bogus:

        rcu_read_lock();
        for_each_online_cpu(cpu) {
                struct task_struct *p;

                /*
                 * Skipping the current CPU is OK even though we can be
                 * migrated at any point. The current CPU, at the point
                 * where we read raw_smp_processor_id(), is ensured to
                 * be in program order with respect to the caller
                 * thread. Therefore, we can skip this CPU from the
                 * iteration.
                 */
                if (cpu == raw_smp_processor_id())
                        continue;
                rcu_read_lock();
                p = rcu_dereference(cpu_rq(cpu)->curr);
                if (p && p->mm == mm)
                        __cpumask_set_cpu(cpu, tmpmask);
        }
        rcu_read_unlock();

Note the double rcu_read_lock(): every iteration takes another rcu_read_lock() that is never balanced inside the loop, so the single rcu_read_unlock() after the loop leaves the RCU read-side nesting count permanently elevated.
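
Presumably the intent was a single RCU read-side critical section covering the whole loop. If so, a minimal sketch of the fix (untested, just to illustrate the idea) would be to simply drop the inner rcu_read_lock():

        rcu_read_lock();
        for_each_online_cpu(cpu) {
                struct task_struct *p;

                if (cpu == raw_smp_processor_id())
                        continue;
                /* rcu_dereference() is covered by the outer read-side section. */
                p = rcu_dereference(cpu_rq(cpu)->curr);
                if (p && p->mm == mm)
                        __cpumask_set_cpu(cpu, tmpmask);
        }
        rcu_read_unlock();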

This bug is now upstream, so it requires an urgent fix: it should be
trivial to trigger with pretty much any membarrier user.
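
For reference, "membarrier user" here means anything along these lines, i.e. a hypothetical minimal reproducer that registers and then issues a private expedited barrier, which walks the online CPUs in membarrier_private_expedited():

        #include <linux/membarrier.h>
        #include <sys/syscall.h>
        #include <unistd.h>
        #include <stdio.h>

        static int membarrier(int cmd, int flags)
        {
                return syscall(__NR_membarrier, cmd, flags);
        }

        int main(void)
        {
                /* Register first, then issue the barrier that runs the
                 * buggy per-CPU loop in membarrier_private_expedited(). */
                if (membarrier(MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED, 0))
                        perror("register");
                if (membarrier(MEMBARRIER_CMD_PRIVATE_EXPEDITED, 0))
                        perror("membarrier");
                return 0;
        }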

Thanks,

Ingo