Re: spin_lock forgets to clobber memory and other smp fixes [was Re: [patch] waitqueue optimization, 2.4.0-test7]

From: Franz Sirl (Franz.Sirl-kernel@lauterbach.com)
Date: Thu Sep 07 2000 - 10:44:20 EST


At 17:03 07.09.00, Andrea Arcangeli wrote:
>On Mon, 4 Sep 2000, Andrea Arcangeli wrote:
>
> >barrier()). I also noticed __sti()/__save_flags() don't need to clobber
> >"memory".
>
>I'm not sure anymore that __sti() and spin_unlock() don't need to clobber
>memory (the clobber looks necessary to make sure the compiler doesn't delay
>writes to memory until after the critical section).
>
>And in practice I can't reproduce any corruption with any subtle testcase
>after removing "memory" from the clobber list of all the
>spinlocks/__sti/__cli, so it may be the other way around. Maybe we can
>rely on the __volatile__ qualifier on the asm to enforce that if we
>write:
>
> *p = 0;
> __asm__ __volatile__("" : :);
> *p = 1;
>
>in the assembly we'll then find both a write of 0 and then a write of 1
>to memory. I'd say with the current compiler we can rely on it (in fact the
>spinlocks don't seem to get miscompiled at the moment even though they
>aren't clobbering "memory").
>
>Same with:
>
> int a = *p;
> __asm__ __volatile__("" : :);
> a = *p;
>
>(to do two explicit reads)
>
>If "memory" isn't necessary in spin_lock, then all the "memory" clobbers
>around include/*/* (also in mb() and __cli() and even barrier()) are not
>necessary. But then "why" we need a "memory" clobber in first place if the
>"memory" isn't necessary? :)) Also the gcc documentation seems to say we
>need "memory" in all such operations (also in __sti() and
>spin_unlock() unlike I said in my earlier email).
>
>I'd like it if some compiler guy could comment (I haven't gotten any reply
>yet). I tried to grep gcc, but my gcc knowledge is too low to quickly
>reverse engineer the implemented semantics of the "memory" clobber (though
>I would appreciate a pointer into the gcc sources as a place to start
>understanding some gcc code :).

In short:

- __volatile__ only assures that the asm isn't reordered against other
__volatile__ asms and isn't hoisted out of loops, nothing else
- the "memory" clobber makes sure the asm isn't reordered against other
memory accesses (see the sketch below)

Essentially, you _always_ have to tell the compiler if you touch memory
behind its back; this includes inline assembly that flushes the cache or the
write-back queue. Since you usually don't know which part of memory gets
touched, you need the global "memory" clobber. If you know exactly which
addresses you touch, you can get away with something like this, from
asm-ppc/io.h:

/* Read a big-endian 32-bit value; the "m" (*addr) operand tells gcc
   exactly which memory the asm reads. */
extern inline unsigned in_be32(volatile unsigned *addr)
{
        unsigned ret;

        __asm__ __volatile__("lwz%U1%X1 %0,%1; eieio"
                             : "=r" (ret) : "m" (*addr));
        return ret;
}

/* Store a 32-bit value byte-reversed (little-endian); the "=m" (*addr)
   operand tells gcc exactly which memory the asm writes. */
extern inline void out_le32(volatile unsigned *addr, int val)
{
        __asm__ __volatile__("stwbrx %1,0,%2; eieio"
                             : "=m" (*addr)
                             : "r" (val), "r" (addr));
}
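
When the asm touches memory that you can't name with an "m" constraint, as
in the cache or write-back-queue flushes mentioned above, the global clobber
is the only description you can give. A hedged sketch (the function name and
the particular dcbf/sync sequence are illustrative, not copied from the
kernel headers):

static inline void flush_dcache_line(void *addr)
{
        /* dcbf flushes the data cache block containing addr. gcc can't
           know which C objects live in that line, so the "memory" clobber
           forces it to complete pending stores before the asm and to
           reload any register-cached values afterwards. */
        __asm__ __volatile__("dcbf 0,%0; sync"
                             : /* no outputs */
                             : "r" (addr)
                             : "memory");
}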

General rule of thumb for inline assembly:

   Give the compiler as much information as possible!!

If you know that the inline assembly reads or writes memory, tell the
compiler about it, in as much detail as possible!
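
Applied to the subject of this thread, a spin_unlock really wants that
clobber too. A hedged sketch (simplified type and store, not the actual
kernel code; the point is only the clobber list):

typedef struct { volatile unsigned int lock; } my_spinlock_t;

static inline void my_spin_unlock(my_spinlock_t *lock)
{
        /* The "memory" clobber keeps gcc from delaying stores made inside
           the critical section until after the lock word has been
           released, and from caching reads across the unlock. */
        __asm__ __volatile__("movl $1,%0"
                             : "=m" (lock->lock)
                             : /* no inputs */
                             : "memory");
}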

Franz.



