On Wed, 23 Nov 2005, Daniel Jacobowitz wrote:
Why should we use a silicon based solution for this, when I posit that
there are simpler and equally effective userspace solutions?
Name them.
In user space, doing things like clever run-time linking things is actually horribly bad. It causes COW faults at startup, and/or makes the compiler have to do indirections unnecessarily. Both of which actually make caches less effective, because now processes that really effectively do have exactly the same contents have them in different pages.
The other alternative (which apparently glibc actually does use) is to dynamically branch over the lock prefixes, which actually works better: it's more work dynamically, but it's much cheaper from a startup standpoint and there's no memory duplication, so while it is the "stupid" approach, it's actually better than the clever one.
The third alternative is to know at link-time that the process never does anything threaded, but that needs more developer attention and non-standard setups, and you _will_ get it wrong (some library will create some thread without the developer even realizing). It also has the duplicated library overhead (but at least now the duplication is just twice, not "each process duplicates its own private pointer")
In short, there simply isn't any good alternatives. The end result is that thread-safe libraries are always in practice thread-safe even on UP, even though that serializes the CPU altogether unnecessarily.
I'm sure you can make up alternatives every time you hit one _particular_ library, but that just doesn't scale in the real world.
In contrast, the simple silicon support scales wonderfully well. Suddenly libraries can be thread-safe _and_ efficient on UP too. You get to eat your cake and have it too.