On Wed, Jan 8, 2025 at 5:11 PM Waiman Long <llong@xxxxxxxxxx> wrote:
> > Most of the users use rqspinlock because it is expected that a
> > deadlock may be constructed at runtime (either due to BPF programs
> > or by attaching programs to the kernel), so lockdep splats will not
> > be helpful on debug kernels.
>
> In most cases, lockdep will report a cyclic locking dependency
> (potential deadlock) before a real deadlock happens, as it requires the
> right combination of events happening in a specific sequence. So lockdep
> can report a deadlock while the runtime check of rqspinlock may not see
> it and there is no locking stall. Also, rqspinlock will not see the
> other locks held in the current context.
>
> > Say a mix of both qspinlock and rqspinlock were involved in an ABBA
> > situation: as long as an rqspinlock is being acquired on one of the
> > threads, it will still time out even if check_deadlock fails to
> > establish the presence of a deadlock. This means the qspinlock call
> > on the other side will make progress as long as the kernel unwinds
> > locks correctly on failure (by handling rqspinlock errors and
> > releasing held locks on the way out).
>
> That is true only if the latest lock to be acquired is a rqspinlock. If
> all the rqspinlocks in the circular path have already been acquired, no
> unwinding is possible.

There is no 'last lock'. If it's not an AA deadlock, there is more
than one CPU spinning. In a hypothetical mix of rqspinlocks
and regular raw_spinlocks, at least one CPU will be spinning on an
rqspinlock, and despite the missing entries in the lock table it
will still exit by timeout. The execution will continue and eventually
all locks will be released.
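
To make the unwinding concrete, the expected caller pattern looks
roughly like this, using the res_spin_lock() API from the series
(a sketch only: do_pair() and the two locks are made-up names, and
the error codes are illustrative):

/*
 * Sketch of the unwind-on-failure pattern: when an acquisition
 * fails, the caller releases everything taken so far on the way
 * out, which is what lets the CPU holding the other half of a
 * would-be ABBA cycle make progress.
 */
static int do_pair(rqspinlock_t *a, rqspinlock_t *b)
{
	int ret;

	ret = res_spin_lock(a);
	if (ret)
		return ret;		/* e.g. -EDEADLK or -ETIMEDOUT */

	ret = res_spin_lock(b);
	if (ret) {
		res_spin_unlock(a);	/* unwind the held lock */
		return ret;
	}

	/* ... critical section ... */

	res_spin_unlock(b);
	res_spin_unlock(a);
	return 0;
}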

We considered annotating rqspinlock as a trylock with a
raw_spin_lock_init lock class, but the usefulness is quite limited.
It's trylock only, so it may appear in a circular dependency
only in a combination of raw_spin_locks and rqspinlocks,
which is not supposed to ever happen once we convert all bpf inner
parts to rqspinlock.
Patches 17, 18, 19 convert the main offenders. A few remain
that need a bit more thinking.
In the end, all locks at the leaves will be rqspinlocks and
no normal locks will be taken after them
(unless NMIs are doing silly things).
And since rqspinlock is a trylock, lockdep will never complain
about rqspinlock.
Even if an NMI handler is buggy, it's unlikely that the NMI's
raw_spin_lock is in a circular dependency with an rqspinlock on
the bpf side.
So rqspinlock entries would only add computational
overhead for the lockdep engine to filter out, and not much more.

This all assumes that rqspinlocks are limited to bpf, of course.
If rqspinlock has use cases beyond bpf then, sure, let's add
trylock lockdep annotations.
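
For the record, such an annotation would amount to something like the
sketch below. rqspinlock_t has no dep_map today, so the field and the
wrapper names are hypothetical; trylock = 1 is what keeps lockdep from
treating the acquisition as blocking:

/* Hypothetical sketch: assumes a dep_map field were added to
 * rqspinlock_t (there is none today).
 */
static __always_inline int res_spin_lock_annotated(rqspinlock_t *lock)
{
	int ret = res_spin_lock(lock);

	/* trylock = 1: lockdep records the acquisition but adds no
	 * blocking dependency, since a trylock cannot wait. It can
	 * still show up in a chain if a regular lock is taken while
	 * this one is held.
	 */
	if (!ret)
		lock_acquire(&lock->dep_map, 0, 1, 0, 1, NULL, _RET_IP_);
	return ret;
}

static __always_inline void res_spin_unlock_annotated(rqspinlock_t *lock)
{
	lock_release(&lock->dep_map, _RET_IP_);
	res_spin_unlock(lock);
}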

Note that if there is an actual bug on the bpf side with rqspinlock
usage, it will be reported even when lockdep is off.
This is patch 13.
Currently it's a pr_info() of the held rqspinlocks and a dump_stack(),
but in the future we plan to make it more easily consumable by the
bpf side, printing into something like a special trace_pipe.
This is tbd.
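
For reference, the current report is roughly of this shape (a sketch:
the per-CPU table name and layout below are illustrative, not the
actual patch 13 code):

/* Illustrative per-CPU table of held rqspinlocks; names and size
 * are made up for this sketch.
 */
struct rqspinlock_held {
	int cnt;
	void *locks[16];
};
static DEFINE_PER_CPU(struct rqspinlock_held, rqspinlock_held_locks);

static void rqspinlock_report(rqspinlock_t *lock)
{
	struct rqspinlock_held *h = this_cpu_ptr(&rqspinlock_held_locks);
	int i;

	pr_info("rqspinlock: deadlock or timeout on %px\n", lock);
	for (i = 0; i < h->cnt; i++)
		pr_info("  held: %px\n", h->locks[i]);
	dump_stack();
}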