Re: [PATCH bpf-next v2 11/26] rqspinlock: Add deadlock detection and recovery

From: Alexei Starovoitov
Date: Fri Feb 07 2025 - 20:54:14 EST


On Thu, Feb 6, 2025 at 2:54 AM Kumar Kartikeya Dwivedi <memxor@xxxxxxxxx> wrote:
>
> +/*
> + * It is possible to run into misdetection scenarios of AA deadlocks on the same
> + * CPU, and missed ABBA deadlocks on remote CPUs when this function pops entries
> + * out of order (due to lock A, lock B, unlock A, unlock B) pattern. The correct
> + * logic to preserve right entries in the table would be to walk the array of
> + * held locks and swap and clear out-of-order entries, but that's too
> + * complicated and we don't have a compelling use case for out of order unlocking.
> + *
> + * Therefore, we simply don't support such cases and keep the logic simple here.
> + */

The comment looks like a leftover from the old version of this patch.
Patch 25 now enforces the FIFO order in the verifier,
and code review will do the same for uses of res_spin_lock()
in bpf internals. So pls drop the comment or reword it.

> +static __always_inline void release_held_lock_entry(void)
> +{
> + struct rqspinlock_held *rqh = this_cpu_ptr(&rqspinlock_held_locks);
> +
> + if (unlikely(rqh->cnt > RES_NR_HELD))
> + goto dec;
> + WRITE_ONCE(rqh->locks[rqh->cnt - 1], NULL);
> +dec:
> + this_cpu_dec(rqspinlock_held_locks.cnt);

..
> + * We don't have a problem if the dec and WRITE_ONCE above get reordered
> + * with each other, we either notice an empty NULL entry on top (if dec
> + * succeeds WRITE_ONCE), or a potentially stale entry which cannot be
> + * observed (if dec precedes WRITE_ONCE).
> + */
> + smp_wmb();

Since smp_wmb() is needed to address the ordering weakness vs try_cmpxchg_acquire(),
would it make sense to move it before this_cpu_dec() to address
the second part of the harmless race as well?
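I.e. roughly the following (just a sketch of the suggested placement, untested):

	WRITE_ONCE(rqh->locks[rqh->cnt - 1], NULL);
dec:
	/*
	 * smp_wmb() here orders the NULL store above before the decrement
	 * below and before any later lock-word store done by the next
	 * res_spin_lock() on this CPU; whether that covers both parts of
	 * the race is the question above.
	 */
	smp_wmb();
	this_cpu_dec(rqspinlock_held_locks.cnt);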