Re: Futex hash_bucket lock can break isolation and cause priority inversion on RT

From: André Almeida
Date: Tue Oct 08 2024 - 11:59:48 EST


On 08/10/2024 12:51, Sebastian Andrzej Siewior wrote:
On 2024-10-08 12:38:11 [-0300], André Almeida wrote:
On 08/10/2024 12:22, Juri Lelli wrote:

[...]

Now, of course by making the latency sensitive application tasks use a
higher priority than anything on housekeeping CPUs we could avoid the
issue, but the fact that an implicit in-kernel link between otherwise
unrelated tasks might cause priority inversion is probably not ideal?
Thus this email.

Does this report make any sense? If it does, has this issue ever been
reported and possibly discussed? I guess it’s kind of a corner case, but
I wonder if anybody has suggestions already on how to possibly try to
tackle it from a kernel perspective.


That's right, unrelated apps can share the same futex bucket, causing those
side effects. The bucket is determined by futex_hash() and then tasks take
the hash bucket lock in futex_q_lock(), and neither of those functions is
aware of priorities.
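
To illustrate how two unrelated apps end up on the same bucket, here is a
toy model in Python. It is only a sketch: toy_futex_hash(), find_collision()
and NUM_BUCKETS are hypothetical names, SHA-256 stands in for the kernel's
real hash function, and the table size is an assumed value.

```python
import hashlib
import struct

NUM_BUCKETS = 256  # assumed size for illustration, must be a power of two


def toy_futex_hash(mm_id: int, uaddr: int) -> int:
    """Map a (process, address) key to a bucket index in one global table.

    SHA-256 is a stand-in for the kernel's hash; only the masking step
    (hash & (size - 1)) mirrors how a power-of-two table is indexed.
    """
    key = struct.pack("<QQ", mm_id, uaddr)
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "little") & (NUM_BUCKETS - 1)


def find_collision(victim_mm: int, victim_uaddr: int, other_mm: int):
    """Scan futex addresses of an unrelated process until one lands in the
    victim's bucket. Returns that address, or None if none was found."""
    target = toy_futex_hash(victim_mm, victim_uaddr)
    for i in range(100_000):
        uaddr = 0x7F0000000000 + 4 * i  # futex words are 4-byte aligned
        if toy_futex_hash(other_mm, uaddr) == target:
            return uaddr
    return None
```

Calling find_collision(1, 0x7F0000001000, 2) returns an address in process 2
whose waiters would queue behind the same bucket lock as process 1's futex,
even though the two processes share nothing else.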

Almost. Since Juri mentioned PREEMPT_RT, the hb locks are aware of
priorities. So in his case there was a PI boost, and the task with the
higher priority can grab the hb lock before others may. However, since
the owner is blocked by the NIC thread, it can't make progress.
Lifting the priority over the NIC thread would bring the owner onto the
CPU so that it can drop the hb lock.


Oh that's right, thanks for pointing it out!

There's this work from Thomas that aims to solve corner cases like this by
giving apps the option to have their own allocated wait queue instead of
using the global hash table:
https://lore.kernel.org/lkml/20160402095108.894519835@xxxxxxxxxxxxx/

"Collisions on that hash can lead to performance degradation
and on real-time enabled kernels to unbound priority inversions."

This is correct. The problem is also that the hb lock is hashed on
several things, so if you restart/reboot you may no longer share the hb
lock with the "bad" application.

Now that I think about it, of all things we never tried a per-process
(shared by threads) hb-lock which could also be hashed. This would avoid
blocking on other applications; you would only have your own threads to
blame.


So if every process has its own hb-lock, every process has its own bucket?
It would act just like a linked list then?
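
To make the question concrete, here is a toy model in Python of what a
per-process table could look like. This is purely a sketch of the idea, not
an existing kernel design: PrivateFutexTable and its methods are
hypothetical names, and SHA-256 again stands in for a real hash function.

```python
import hashlib


class PrivateFutexTable:
    """Hypothetical per-process futex table: one instance per process,
    shared by all of that process's threads."""

    def __init__(self, num_buckets: int):
        self.num_buckets = num_buckets
        # Each bucket is a plain list standing in for a lock-protected queue.
        self.buckets = [[] for _ in range(num_buckets)]

    def bucket_index(self, uaddr: int) -> int:
        # Only the address is hashed: the process is implied by the table.
        digest = hashlib.sha256(uaddr.to_bytes(8, "little")).digest()
        return int.from_bytes(digest[:4], "little") % self.num_buckets

    def enqueue(self, uaddr: int, tid: int) -> None:
        """Queue thread `tid` as a waiter on the futex at `uaddr`."""
        self.buckets[self.bucket_index(uaddr)].append((uaddr, tid))

    def waiters_sharing_bucket(self, uaddr: int):
        """All waiters that contend on the same bucket as `uaddr`."""
        return list(self.buckets[self.bucket_index(uaddr)])
```

In this model, waiters queued in one process's table can never show up in
another's, so contention stays within a single application. With
num_buckets=1 every waiter of the process lands in the same queue, which is
the linked-list case; with more buckets the process's own threads are spread
out by the hash but can still collide with each other.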

Thanks!
Juri

Sebastian