Re: [patch] Re: spin_unlock optimization(i386)

Linus Torvalds (torvalds@transmeta.com)
Sun, 21 Nov 1999 11:05:47 -0800 (PST)


On Sun, 21 Nov 1999, Ingo Molnar wrote:
> >
> > I think it's possible to replace that with
> > "movl $0,%0"
> > which would be a simple, pairable single-tick instruction.
>
> [this very issue popped up a couple of days ago on FreeBSD mailing lists
> as well.]

It does NOT WORK!

Let the FreBSD people use it, and let them get faster timings. They will
crash, eventually.

The window may be small, but if you do this, then suddenly spinlocks
aren't reliable any more.

The issue is not writes being issued in-order (although all the Intel CPU
books warn you NOT to assume that in-order write behaviour - I bet it
won't be the case in the long run).

The issue is that you _have_ to have a serializing instruction in order to
make sure that the processor doesn't re-order things around the unlock.

For example, with a simple write, the CPU can legally delay a read that
happened inside the critical region (maybe it missed a cache line), and
get a stale value for any of the reads that _should_ have been serialized
by the spinlock.

Note that I actually thought this was a legal optimization, and for a
while I had this in the kernel. It crashed. In random ways.

Note that the fact that it does not crash now is quite possibly because
of either
- we have a lot less contention on our spinlocks these days. That might
hide the problem, because the _spinlock_ will be fine (the cache
coherency still means that the spinlock itself works fine - it's just
that it no longer works reliably as an exclusion thing)
- the window is probably very very small, and you have to be unlucky to
hit it. Faster CPU's, different compilers, whatever.

I might be proven wrong, but I don't think I am.

Note that another thing is that yes, "btcl" may be the worst possible
thing to use for this, and you might test whether a simpler "xor+xchgl"
might be better - it's still serializing because it is locked, but it
should be the normal 12 cycles that Intel always seems to waste on
serializing instructions rather than 22 cycles.

Linus

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/