Re: [PATCH v4 06/11] futex: Allow to re-allocate the private hash bucket.

From: Thomas Gleixner
Date: Wed Dec 11 2024 - 09:33:22 EST


On Tue, Dec 10 2024 at 23:27, Thomas Gleixner wrote:
> Why does unqueue() work w/o a hash bucket reference?
>
> unqueue(q)
> {

This actually needs a

guard(rcu);

to protect against a concurrent rehashing.

> retry:
>     lock_ptr = READ_ONCE(q->lock_ptr);
>     // Wake up ?
>     if (!lock_ptr)
>         return 0;
>
>     spin_lock(lock_ptr);
>
>     // This covers both requeue and rehash operations
>     if (lock_ptr != q->lock_ptr) {
>         spin_unlock(lock_ptr);
>         goto retry;
>     }
>
>     __unqueue(q);
>     spin_unlock(lock_ptr);
> }
>
> Nothing in unqueue() requires a reference on the hash. The lock pointer
> logic covers both requeue and rehash operations. They are equivalent,
> no?
>
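For illustration only, the lock pointer recheck above can be modelled in
plain userspace C. This is a sketch under loud assumptions: struct bucket,
struct waiter, bucket_lock() and enqueue() are hypothetical stand-ins, a
C11 atomic_flag plays the role of the spinlock, and buckets are never freed,
so the guard(rcu) protection discussed above is deliberately left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins, not the kernel's futex_hash_bucket / futex_q. */
struct bucket {
	atomic_flag lock;
	int nr_queued;
};

struct waiter {
	_Atomic(struct bucket *) bucket;	/* plays the role of q->lock_ptr */
};

static void bucket_init(struct bucket *b)
{
	atomic_flag_clear(&b->lock);
	b->nr_queued = 0;
}

static void bucket_lock(struct bucket *b)
{
	while (atomic_flag_test_and_set_explicit(&b->lock, memory_order_acquire))
		;	/* spin */
}

static void bucket_unlock(struct bucket *b)
{
	atomic_flag_clear_explicit(&b->lock, memory_order_release);
}

static void enqueue(struct waiter *w, struct bucket *b)
{
	bucket_lock(b);
	b->nr_queued++;
	atomic_store(&w->bucket, b);
	bucket_unlock(b);
}

/* Returns 1 if the waiter was removed here, 0 if it was already woken. */
static int unqueue(struct waiter *w)
{
	struct bucket *b;

	for (;;) {
		b = atomic_load(&w->bucket);
		if (!b)
			return 0;	/* woken, nothing to remove */

		bucket_lock(b);
		/* Unchanged pointer: neither requeue nor rehash moved us. */
		if (b == atomic_load(&w->bucket))
			break;
		bucket_unlock(b);	/* moved meanwhile, try again */
	}

	assert(b->nr_queued > 0);
	b->nr_queued--;
	atomic_store(&w->bucket, NULL);
	bucket_unlock(b);
	return 1;
}
```

The point of the model is that the waiter never pins the bucket. It only
rechecks, under the lock, that the pointer it loaded is still current.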
> wake() is not really different. It only needs a change in how the
> private retry works:
>
> wake_op()
> {
> retry:
>     get_key(key1);
>     get_key(key2);
>
> retry_private:
>     double_get_and_lock(&hb1, &hb2, &key1, &key2);
>     .....
>     double_unlock_and_put(&hb1, &hb2);
>     .....
> }
>
> Moving retry private before the point where the hash bucket is retrieved
> and locked is required in some other place too. And some places use
> q.lock_ptr under the assumption that it can't change, which probably
> needs reevaluation of the hash bucket. Other stuff like lock_pi() needs
> a separation of unlocking the hash bucket and dropping the reference.
>
> But those are all minor changes.
>
> All of them can be done on a per function basis before adding the actual
> private hash muck, which makes the whole thing reviewable. This patch
> definitely does not qualify as reviewable.
>
> All you need are implementations for hb_get_and_lock/unlock_and_put()
> plus the double variants and a hash_put() helper. Those implementations
> use the global hash until all places are mopped up and then you can add
> the private magic in exactly those places.
>
> There is not a single place where you need magic state fixups in the
> middle of the functions or conditional locking, which turns out to be
> not sufficient.
>
> The required helpers are:
>
> hb_get_and_lock(key)
> {
>     if (private(key))
>         hb = private_hash(key); // Gets a reference
>     else
>         hb = hash_bucket(global_hash, key);
>     hb_lock(hb);
>     return hb;
> }
>
> hb_unlock_and_put(hb)
> {
>     hb_unlock(hb);
>     if (private(hb))
>         hb_private_put(hb);
> }
>
> The double lock/unlock variants are equivalent.
>
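A userspace sketch of what the double variants could look like, purely as
an assumption about their shape: the struct layouts, hb_get(), hb_put(),
hash_init() and NBUCKETS are made up for the example, the reference count
is a plain C11 atomic, and double_unlock_and_put() takes the buckets by
value for brevity. The two buckets are locked in address order, and only
once when both keys hash to the same bucket, to avoid an ABBA deadlock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define NBUCKETS 16	/* arbitrary size for the example */

struct hash;

struct bucket {
	atomic_flag lock;
	struct hash *owner;
};

struct hash {
	atomic_int refs;	/* only used for the private hash */
	int is_private;
	struct bucket buckets[NBUCKETS];
};

struct key {
	unsigned int value;
	int is_private;
};

static struct hash global_hash, private_hash;

static void hash_init(struct hash *h, int is_private)
{
	atomic_init(&h->refs, 0);
	h->is_private = is_private;
	for (int i = 0; i < NBUCKETS; i++) {
		atomic_flag_clear(&h->buckets[i].lock);
		h->buckets[i].owner = h;
	}
}

static void hb_lock(struct bucket *hb)
{
	while (atomic_flag_test_and_set_explicit(&hb->lock, memory_order_acquire))
		;	/* spin */
}

static void hb_unlock(struct bucket *hb)
{
	atomic_flag_clear_explicit(&hb->lock, memory_order_release);
}

/* Look up the bucket; a private lookup takes a reference on the hash. */
static struct bucket *hb_get(const struct key *key)
{
	struct hash *h = key->is_private ? &private_hash : &global_hash;

	if (h->is_private)
		atomic_fetch_add(&h->refs, 1);
	return &h->buckets[key->value % NBUCKETS];
}

static void hb_put(struct bucket *hb)
{
	if (hb->owner->is_private) {
		int old = atomic_fetch_sub(&hb->owner->refs, 1);

		assert(old > 0);	/* every put must pair with a get */
	}
}

static void double_get_and_lock(struct bucket **hb1, struct bucket **hb2,
				const struct key *key1, const struct key *key2)
{
	*hb1 = hb_get(key1);
	*hb2 = hb_get(key2);

	if (*hb1 == *hb2) {
		hb_lock(*hb1);		/* same bucket: lock it once */
	} else if ((uintptr_t)*hb1 < (uintptr_t)*hb2) {
		hb_lock(*hb1);
		hb_lock(*hb2);
	} else {
		hb_lock(*hb2);
		hb_lock(*hb1);
	}
}

static void double_unlock_and_put(struct bucket *hb1, struct bucket *hb2)
{
	hb_unlock(hb1);
	if (hb1 != hb2)
		hb_unlock(hb2);
	hb_put(hb1);
	hb_put(hb2);
}
```

Each get takes its own reference, even when both keys land in the same
bucket, so the put side stays symmetric and needs no special casing.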
> private_hash(key)
> {
>     scoped_guard(rcu) {
>         hash = rcu_deref(current->mm->futex.hash);

This actually requires:

        if (!hash)
            return global_hash;

otherwise this results in a NULL pointer dereference, aka. unprivileged
DoS when a single threaded process invokes sys_futex(...) directly.

That begs the question whether current->mm->futex.hash should be
initialized with &global_hash in the first place and &global_hash having
a reference count too, which never can go to zero. That would simplify
the whole logic there.
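
Roughly what I have in mind, as a userspace sketch and nothing more:
struct mm, mm_init(), mm_set_private(), hash_get() and hash_put() are
hypothetical names, the counter is a C11 atomic, and the RCU protection
and inc-not-zero handling a real implementation would need are left out.
The global hash starts with a base reference that is never dropped, so
the lookup path needs neither a NULL check nor a global/private branch.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-ins for the futex hash and the per-mm pointer. */
struct hash {
	atomic_long refs;
	int is_global;
};

/* The base reference of the global hash is never dropped. */
static struct hash global_hash = { .refs = 1, .is_global = 1 };

struct mm {
	_Atomic(struct hash *) futex_hash;
};

/* A new mm starts out pointing at the global hash, never at NULL. */
static void mm_init(struct mm *mm)
{
	atomic_init(&mm->futex_hash, &global_hash);
}

/* Install a private hash; it starts with one base reference. */
static void mm_set_private(struct mm *mm, struct hash *priv)
{
	atomic_init(&priv->refs, 1);
	priv->is_global = 0;
	atomic_store(&mm->futex_hash, priv);
}

/* Same code path for global and private: no NULL check, no branch. */
static struct hash *hash_get(struct mm *mm)
{
	struct hash *h = atomic_load(&mm->futex_hash);

	atomic_fetch_add(&h->refs, 1);
	return h;
}

/* Returns 1 when the last reference went away, which never happens
 * for the global hash because of its permanent base reference. */
static int hash_put(struct hash *h)
{
	long old = atomic_fetch_sub(&h->refs, 1);

	assert(old > 1 || !h->is_global);
	return old == 1;
}
```

With that, a single threaded process calling sys_futex() directly just
ends up on the global hash through the regular get/put path.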

Thanks,

tglx