Re: [patch 2/3] scheduler: add full memory barriers upon task switch at runqueue lock/unlock

From: Linus Torvalds
Date: Mon Feb 01 2010 - 15:44:40 EST




On Mon, 1 Feb 2010, Mathieu Desnoyers wrote:
>
> The two event pairs we are looking at are:
>
> Pair 1)
>
> * memory accesses (load/stores) performed by user-space thread before
> context switch.
> * cpumask_clear_cpu(cpu, mm_cpumask(prev));
>
> Pair 2)
>
> * cpumask_set_cpu(cpu, mm_cpumask(next));
> * memory accesses (load/stores) performed by user-space thread after
> context switch.

So explain: why does that smp_mb() in between the two _help_?

The user of this will do a

	for_each_cpu(mm_cpumask)
		send_IPI(cpu, smp_mb);

but that's not an atomic op _anyway_. So you're reading mm_cpumask
somewhere earlier, and doing the send_IPI later. So look at the whole
scenario 2:

	cpumask_set_cpu(cpu, mm_cpumask(next));
	memory accesses performed by user-space

and think about it from the perspective of another CPU. What does an
smp_mb() in between the two do?

I'll tell you - it does NOTHING. Because it doesn't matter. I see no
possible way another CPU can care, because let's assume that the other CPU
is doing that

	for_each_cpu(mm_cpumask)
		send_ipi(smp_mb);

and you have to realize that the other CPU needs to read that mm_cpumask
early in order to do that.

So you have this situation:

	CPU1				CPU2
	----				----

	cpumask_set_cpu
					read mm_cpumask
	smp_mb
					smp_mb
	user memory accesses
					send_ipi

and exactly _what_ is that "smp_mb" on CPU1 protecting against?

Realize that CPU2 is not ordered (because you wanted to avoid the
locking), so the "read mm_cpumask" can happen before or after that
cpumask_set_cpu. And it can happen before or after REGARDLESS of that
smp_mb. The smp_mb doesn't make any difference to CPU2 that I can see.

So the question becomes one of "How can CPU2 care about whether CPU1 is in
the mask"? Considering that CPU2 doesn't do any locking, I don't see any
way you can get a "consistent" CPU mask _regardless_ of any smp_mb's in
there. When it does the "read mm_cpumask()" it might get the value
_before_ the cpumask_set_cpu, and it might get the value _after_, and
that's true regardless of whether there is a smp_mb there or not.
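
To make the "before or after, regardless" point concrete, here is a toy sketch in plain C. It is not kernel code and it is not a real memory model: every name in it (the event enum, struct outcomes, count_outcomes) is hypothetical, and it assumes a simplified picture in which CPU1's store to the mask becomes visible to CPU2 at some separate later point, and the smp_mb on CPU1 only forces that point to come before the user memory accesses. It enumerates every ordering the toy rules allow and counts what CPU2's unlocked read of the mask observes.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model, not kernel code. All names here are hypothetical. */
enum event {
	EV_SET,      /* CPU1: cpumask_set_cpu() issued */
	EV_VISIBLE,  /* CPU1's store to the mask becomes visible to CPU2 */
	EV_USER,     /* CPU1: user-space memory accesses */
	EV_READ,     /* CPU2: read mm_cpumask (no lock, so unordered) */
	EV_COUNT
};

struct outcomes {
	int read_saw_set;    /* orderings where CPU2 saw CPU1's bit set */
	int read_saw_clear;  /* orderings where CPU2 saw it still clear */
};

/* Assumed ordering rules of this toy model. */
static bool order_allowed(const int pos[EV_COUNT], bool cpu1_has_mb)
{
	if (pos[EV_SET] > pos[EV_VISIBLE])
		return false;	/* a store cannot be visible before it is issued */
	if (pos[EV_SET] > pos[EV_USER])
		return false;	/* program order on CPU1 */
	if (cpu1_has_mb && pos[EV_VISIBLE] > pos[EV_USER])
		return false;	/* the barrier: visible before user accesses */
	return true;		/* EV_READ is never constrained */
}

/* Try every permutation of the four events; tally the allowed ones. */
static void permute(int order[EV_COUNT], int k, bool cpu1_has_mb,
		    struct outcomes *out)
{
	assert(k >= 0 && k <= EV_COUNT);
	if (k == EV_COUNT) {
		int pos[EV_COUNT];

		for (int i = 0; i < EV_COUNT; i++)
			pos[order[i]] = i;
		if (!order_allowed(pos, cpu1_has_mb))
			return;
		if (pos[EV_VISIBLE] < pos[EV_READ])
			out->read_saw_set++;
		else
			out->read_saw_clear++;
		return;
	}
	for (int i = k; i < EV_COUNT; i++) {
		int tmp = order[k];

		order[k] = order[i];
		order[i] = tmp;
		permute(order, k + 1, cpu1_has_mb, out);
		tmp = order[k];
		order[k] = order[i];
		order[i] = tmp;
	}
}

struct outcomes count_outcomes(bool cpu1_has_mb)
{
	int order[EV_COUNT] = { EV_SET, EV_VISIBLE, EV_USER, EV_READ };
	struct outcomes out = { 0, 0 };

	permute(order, 0, cpu1_has_mb, &out);
	return out;
}
```

Calling count_outcomes(true) and count_outcomes(false) gives non-zero counts for both read_saw_set and read_saw_clear in either case. Within this toy model, the barrier on CPU1 changes how many orderings exist, but not the fact that CPU2's read can land on either side of the cpumask_set_cpu.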

See what I'm asking for? I'm asking for why it matters that we have a
memory barrier, and why that mm_cpumask is so magical that _that_ access
matters so much.

Maybe I'm dense. But if somebody puts memory barriers in the code, I want
to know exactly what the reason for the barrier is. Memory ordering is too
subtle and non-intuitive to go by gut feel.

Linus