Re: [PATCH] Guarantee spinlocks implicit barrier for !PREEMPT_COUNT

From: Linus Torvalds
Date: Mon Apr 08 2013 - 00:48:27 EST


On Sun, Apr 7, 2013 at 9:20 PM, Vineet Gupta <Vineet.Gupta1@xxxxxxxxxxxx> wrote:
>
> Christian had already proposed that change - only I was reluctant to take it - as
> local_irq_* is used heavily in a configuration of ARC700 Linux where the (!SMP)
> cpu doesn't support load-locked/store-conditional instructions, so atomic R-M-W
> sequences need those routines. Adding a mem clobber would lead to pathetic
> generated code, which is likely why I removed it at some point.

Umm. The irq instructions absolutely *HAVE* to have the barriers. And
yes, the R-M-W instructions (if they are protected by them) obviously
need to have the load and the store inside the spinlocked region.
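
Roughly the UP pattern in question - a sketch along the lines of the
generic non-SMP atomics, not the actual ARC code:

    static inline void atomic_add(int i, atomic_t *v)
    {
            unsigned long flags;

            local_irq_save(flags);          /* irqs off: nothing can race */
            v->counter += i;                /* plain load-modify-store */
            local_irq_restore(flags);
    }

This is only correct if the irq save/restore pair also acts as a
compiler barrier, keeping the load and the store inside the protected
region.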

> Also I'd thought that given that barriers are already present in PREEMPT_COUNT
> variant of preempt_* macros we might as well add them to !PREEMPT_COUNT ones.

That's absolutely insane. It's insane for two reasons:

- it's not SUFFICIENT. Any user of irq-safety needs the loads and
stores to stay inside the irq-safe region, and the preempt macros have
absolutely nothing to do with that.

- it's POINTLESS and WRONG. If you run on UP, and have preemption
disabled, then there is no reason for a barrier in the preempt code,
since moving code around them doesn't matter - nothing is ever going
to preempt that.
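
For reference, the !PREEMPT_COUNT variants compile to nothing at all -
roughly:

    #define preempt_disable()       do { } while (0)
    #define preempt_enable()        do { } while (0)

Putting a barrier in there buys you nothing, because on UP with
preemption off there is nothing that could ever observe the reordering.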

The thing is, the irq disable/enable sequence absolutely has to have a
memory barrier in it, because if the compiler moves a load outside of
the irq-protected region, then that is a bug. Now, if you want to play
games with your atomics and do them with handcoded asm, go right
ahead, but that really isn't an argument for getting everything else
wrong.
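
x86, for example, puts the clobber directly in the inline asm
(arch/x86/include/asm/irqflags.h):

    static inline void native_irq_disable(void)
    {
            asm volatile("cli": : :"memory");
    }

    static inline void native_irq_enable(void)
    {
            asm volatile("sti": : :"memory");
    }

The "memory" clobber is the compiler barrier; the instruction itself is
what makes the region irq-safe.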

That said, thinking about barriers and preemption made me realize that
we do have a potential issue between: (a) non-preemption UP kernel
(with no barriers in the preempt_enable/disable()) and (b)
architectures that use inline asm without a memory clobber for
get_user/put_user(). That includes x86.

The reason is that I could imagine code like

    if (get_user(val, addr))
            return -EFAULT;
    preempt_disable();
    ... do something percpu ..
    preempt_enable();

and it turns out that for non-preempt UP, we don't tell the compiler
anywhere that it can't move the get_user past the preempt_disable. But
the get_user() can cause a preemption point because of a page fault,
obviously.
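
What makes this possible is that the x86 get_user() family is inline
asm with no "memory" clobber - a simplified sketch, not the real macro:

    int val;

    asm volatile("mov %1, %0"       /* user-space load, may fault */
                 : "=r" (val)
                 : "m" (*addr));

gcc only sees a read of *addr here, and since the non-preempt
preempt_disable() generates no code at all, nothing stops it from
moving that load into the middle of the percpu region.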

I suspect that the best way to fix this ends up relying on the gcc
"asm volatile" rules, and make the rule be that:
- preempt_disable/enable have to generate an asm volatile() string
(preferably just an asm comment saying "preempt disable/enable")
- get_user/put_user doesn't need to add a memory clobber, but needs
to be done with an asm volatile too.

Then the gcc "no reordering of volatile asms" should make the above be
ok, without having to add an actual compiler memory barrier.
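
Concretely, something like this (a sketch of the idea, not a tested
patch):

    #define preempt_disable()       asm volatile("# preempt disable")
    #define preempt_enable()        asm volatile("# preempt enable")

gcc does not reorder volatile asms with respect to each other, so a
get_user() that is itself an asm volatile can no longer cross these,
even though neither side clobbers memory.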

Ingo? Peter? I'm not sure anybody really uses UP+no-preempt on x86,
but it does seem to be a bug.. Comments?

But the irq disable/enable sequences definitely need the memory
clobber. Because caching memory that could be changed by an interrupt
in registers is not good.
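
The classic failure mode, as a hypothetical example:

    static int done;                /* set by an interrupt handler */

    local_irq_enable();
    while (!done)
            ;                       /* without a "memory" clobber on the
                                       irq enable, gcc may read "done"
                                       once into a register and spin on
                                       that register forever */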

Linus