Jamie,
I should have never brought this example up. The thread "spin_unlock()
optimization" that went on for three weeks with Intel and the whole
planet getting into the fray speaks for itself, and anyone wanting to
search the kernel archives can do so and for their own opinion. I won't
use a Linus example in the future since this seems to turn off some
folks reason centers and they go into "attack mode". I wasn't intending
to offend, just cite an example for Al Viro relative to why code reviews
alone don't give all the perspectives on the whole topic of a kernel
debugger in Linux.
:-)
Jeff
Jamie Lokier wrote:
>
> Jeff V. Merkey wrote:
> > To cite a Linux specific example, let's take the issue with the memory
> > write for a spin_unlock(). Linus seemed to have trouble grasping why
> > a simple ' mov <addr>, 0' would work as opposed to a 'lock dec
> > <addr.>'
>
> No logic analyser will tell you the subtleties of _why_ it works.
> You'll see MESI working, you'll see processor ordering in your test
> cases, but that doesn't tell you whether processor ordering works.
>
> > Anyone who has ever spent late nights with an American Arium Analyzer
> > profiling memory bus transactions on a PPro knows that MESI [...] will
> > correctly propogate via the processor caches a write to a locked
> > location with a correct load and stor oder without any problems of
> > locking concurrency.
>
> Wrong. They observe no locking problems with their particular test
> cases. The logic analyser doesn't tell you that no code sequence will
> exhibit a locking problem. It also doesn't mean that no future
> processor will exhibit the problem.
>
> Instead of using a bus analyzer to see that there's no _symptom_, the
> kernel developers looked at Intel's specifications. A guy from Intel
> helped with that. Eventually it was confirmed that Intel does actually
> guarantee 'movb' works for spin-unlock.
>
> At the same time, a few folks ran tests on a number of processors to see
> if the ordering specifications were really followed. A lot of
> misunderstanding and confusion did result from that.
>
> Some tests failed, but they were actually the wrong tests for
> spin-unlock, which is ok with 'movb'. They were the right tests for
> some other subtle ordering problems though.
>
> In the process, many of us learned a lot about x86 read-write ordering
> rules. Through this, other bugs were found. See __set_task_state for
> example.
>
> If someone had just use the logic analyser, we'd never have constructed
> the wrong threading tests, and we probably wouldn't have spotted the
> task_state bug.
>
> > Linus' apparently did not understand this, or he would have
> > immediately realized that double locking was always generating a
> > second non-cacheable memory reference for every lock being taken and
> > released.
>
> Erm, I think we _all_ knew about the second memory reference...
> But non-cacheable? On a PPro lock operations are cacheable.
>
> > The person writing and updating page table entries in NetWare 4.1 was
> > clearing the accessed bit in the PTE and did not know that the
> > processor would assert a hidden R/M/W operation and assert a bus lock
> > to set this bit everytime someone cleared it -- it made performance
> > drop 4% from NetWare 3.X and noone knew why. This performance problem
> > would have never been found without this tool. 2 years of code
> > reviews did not find it -- an American Ariun Analyser with a kernel
> > debugger to stop and start and instrument the code with writing custom
> > stubs all over the place did.
>
> The kernel developers have known about those R/M/W "hidden cycles"
> forever. See any standard Pentium textbook (or even 386/486 for that
> matter).
>
> Heck, even _I_ know this stuff and I've never programmed any page table
> code, just read those parts of the kernel.
>
> > Folks who just rely on code reviews never see this level of
> > interaction, and conversely, do not have the understanding of hardware
> > behavior underneath an OS to optimize it well.
>
> Apparently your engineers didn't read the textbooks.
>
> -- Jamie
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/
This archive was generated by hypermail 2b29 : Fri Sep 15 2000 - 21:00:15 EST