Re: [PATCH 3/3] 64-bit futexes: x86 support
From: Ulrich Drepper
Date: Sat May 31 2008 - 00:25:45 EST
Linus Torvalds wrote:
> I'm suggesting that there are better approaches than forcing yet another
> new interface onto this thing. As mentioned, the kernel has had 32-bit
> rwsemaphores for a long time. Yes, there is "extra information" outside
> the 32-bit field, but I suspect you simply haven't looked hard enough at
> just using the existing futex code.
I guess I have to expect to be insulted by you. What else is
new? I think it's safe to say that nobody has looked more deeply at the
implementation possibilities with futexes. I don't know how many
different primitives I've written over the years. If you think you can
do it better, well, that's then a challenge to you. Come up with
something better.
The rwlocks cannot be handled as efficiently with 32-bit futexes.
I stand by that. Prove me wrong. You don't have to write code; just
describe it. But don't publish it if
- the uncontended reader *unconditionally* uses more than one memory
  access (atomic, of course)
- the contended reader needs more than the one access above plus one or
  two compare-and-exchange operations, depending on whether the rwlock
  prefers readers or writers. The last compare-and-exchange has to be
  repeated for spurious wakeups, which are highly unlikely since we only
  wake what is ready to run.
- the uncontended writer needs more than one single compare-and-exchange
  plus one memory read plus one write (both not atomic)
- the contended writer needs more than the above plus one additional
  compare-and-exchange for every time we delay in futex_wait (again,
  spurious wakeups are highly unlikely)
- the unlock operation needs more than one read plus either one
  (repeated) compare-and-exchange or another atomic operation plus one
  memory write for writer locks
- you cannot handle timedlock variants which fail to get the lock
  efficiently, i.e., without waking up anyone else
That's it.
And just to point out some more fun: in a highly contended case this is
a performance killer (MESI protocol states):
      core #1                        core #2
  instr    cache state           instr    cache state
  load        -> S
  store       -> M
              -> S               load       -> S
                                 store      -> M
I.e., if the cache line with the rwlock is in one core's cache and you
first load a value from that cache line just to modify it right after
that, you pay a terrible price.
Have fun coming up with something which is as good or better. But shall
we say there is a deadline? Otherwise we'd wait too long.
--
Ulrich Drepper · Red Hat, Inc. · 444 Castro St · Mountain View, CA