Re: 2.6.0-test6 oops futex"

From: Jamie Lokier
Date: Tue Sep 30 2003 - 20:02:41 EST


Hugh Dickins wrote:
> I can't quite make the flaw I do see, fit the facts Klaus sees.
>
> Consider unqueue_me on one cpu racing against this->key = key2 in
> futex_requeue on another. unqueue_me finds the right bh to lock
> from q->key, but futex_requeue is shifting q->key beneath it.
> Although futex_requeue holds both the relevant spinlocks, during
> reassignment of key an irrelevant hashqueue may get calculated
> and locked instead.
>
> Whether or not this is relevant to Klaus' oops, it does need to be
> fixed, doesn't it? And your new key_refs version even more exposed?
> I'm giving up on it, but expect you or Rusty will be ingenious enough
> to rescue the split locks somehow.

Superb analysis!

It does indeed explain Klaus' oops. unqueue_me() and futex_poll()
hash using &q->key outside any spinlocks, so if there's a concurrent
futex_requeue() the hash can be corrupt. Result: wrong lock taken;
list manipulation races in test6, not in test5. (Provided Klaus has
SMP or pre-empt).

Solutions are to call hash_futex() inside the lock, or store the
hash result inside futex_q, and read it inside the lock.

You're right, the key_refs version is more exposed, but because of the
change to futux_requeue logic rather than anything to do with key_refs.

> (Oh, while you're there, be nice to fix nr_requeue 0.)

Logically, zero num in FUTEX_QUEUE shouldn't wake anything either, but
it's understood that at least one is always woken. It's a bit of a
wart I agree, but I'll go along with Ulrich's view.

Thanks,
-- Jamie
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/