Re: [patch 2/3] scheduler: add full memory barriers upon task switch at runqueue lock/unlock
From: Linus Torvalds
Date: Mon Feb 01 2010 - 15:44:40 EST
On Mon, 1 Feb 2010, Mathieu Desnoyers wrote:
>
> The two event pairs we are looking at are:
>
> Pair 1)
>
> * memory accesses (load/stores) performed by user-space thread before
> context switch.
> * cpumask_clear_cpu(cpu, mm_cpumask(prev));
>
> Pair 2)
>
> * cpumask_set_cpu(cpu, mm_cpumask(next));
> * memory accesses (load/stores) performed by user-space thread after
> context switch.
So explain: why does that smp_mb() in between the two _help_?
The user of this will do a
        for_each_cpu(mm_cpumask)
                send_IPI(cpu, smp_mb);
but that's not an atomic op _anyway_. So you're reading mm_cpumask
somewhere earlier, and doing the send_IPI later. So look at the whole
scenario 2:
        cpumask_set_cpu(cpu, mm_cpumask(next));
        memory accesses performed by user-space
and think about it from the perspective of another CPU. What does an
smp_mb() in between the two do?
I'll tell you - it does NOTHING. Because it doesn't matter. I see no
possible way another CPU can care, because let's assume that the other CPU
is doing that
        for_each_cpu(mm_cpumask)
                send_ipi(smp_mb);
and you have to realize that the other CPU needs to read that mm_cpumask
early in order to do that.
So you have this situation:
        CPU1                            CPU2
        ----                            ----
        cpumask_set_cpu
                                        read mm_cpumask
        smp_mb
                                        smp_mb
        user memory accesses
                                        send_ipi
and exactly _what_ is that "smp_mb" on CPU1 protecting against?
Realize that CPU2 is not ordered (because you wanted to avoid the
locking), so the "read mm_cpumask" can happen before or after that
cpumask_set_cpu. And it can happen before or after REGARDLESS of that
smp_mb. The smp_mb doesn't make any difference to CPU2 that I can see.
So the question becomes one of "How can CPU2 care about whether CPU1 is in
the mask"? Considering that CPU2 doesn't do any locking, I don't see any
way you can get a "consistent" CPU mask _regardless_ of any smp_mb's in
there. When it does the "read mm_cpumask()" it might get the value
_before_ the cpumask_set_cpu, and it might get the value _after_, and
that's true regardless of whether there is a smp_mb there or not.
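
The argument above can be sketched as a small userspace model. This is only an illustration under stated assumptions, not kernel code: `mask` is a hypothetical stand-in for mm_cpumask, `atomic_thread_fence(memory_order_seq_cst)` stands in for smp_mb(), and the helper names (`cpu1_switch_in`, `cpu2_snapshot`, `cpu1_seen`, `check_barrier_irrelevant`) are invented for this example. It replays the two possible orders of "set bit" and "read mask" one after the other, so it does not exercise real hardware reordering; it only shows that what the lockless reader sees is decided by which side ran first, with or without the barrier.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for mm_cpumask: one bit per CPU. */
static atomic_ulong mask;

/* CPU1 side: set our bit, optionally followed by a full fence. */
static void cpu1_switch_in(int cpu, bool with_barrier)
{
	atomic_fetch_or_explicit(&mask, 1UL << cpu, memory_order_relaxed);
	if (with_barrier)
		atomic_thread_fence(memory_order_seq_cst); /* models smp_mb() */
}

/* CPU2 side: lockless snapshot; IPIs would go to the bits set in it. */
static unsigned long cpu2_snapshot(void)
{
	return atomic_load_explicit(&mask, memory_order_relaxed);
}

/*
 * Replay one of the two possible orders sequentially and report
 * whether CPU2's snapshot contains CPU1's bit (bit 1 here).
 */
static bool cpu1_seen(bool read_first, bool with_barrier)
{
	unsigned long snap;

	atomic_store(&mask, 0);
	if (read_first) {
		snap = cpu2_snapshot();
		cpu1_switch_in(1, with_barrier);
	} else {
		cpu1_switch_in(1, with_barrier);
		snap = cpu2_snapshot();
	}
	return (snap >> 1) & 1;
}

/* In this model the barrier changes nothing for either order. */
static void check_barrier_irrelevant(void)
{
	assert(cpu1_seen(true, false) == cpu1_seen(true, true));
	assert(cpu1_seen(false, false) == cpu1_seen(false, true));
}
```

Calling `check_barrier_irrelevant()` passes in this model: the reader misses CPU1 when it reads first and sees it when it reads second, and the fence on CPU1 does not move it from one case to the other.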
See what I'm asking for? I'm asking for why it matters that we have a
memory barrier, and why that mm_cpumask is so magical that _that_ access
matters so much.
Maybe I'm dense. But if somebody puts memory barriers in the code, I want
to know exactly what the reason for the barrier is. Memory ordering is too
subtle and non-intuitive to go by gut feel.
Linus