Re: [PATCH bpf-next v1 00/22] Resilient Queued Spin Lock
From: Kumar Kartikeya Dwivedi
Date: Wed Jan 08 2025 - 15:13:25 EST
On Wed, 8 Jan 2025 at 14:48, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> On Tue, Jan 07, 2025 at 03:54:36PM -0800, Linus Torvalds wrote:
> > On Tue, 7 Jan 2025 at 06:00, Kumar Kartikeya Dwivedi <memxor@xxxxxxxxx> wrote:
> > >
> > > This patch set introduces Resilient Queued Spin Lock (or rqspinlock with
> > > res_spin_lock() and res_spin_unlock() APIs).
> >
> > So when I see people doing new locking mechanisms, I invariably go "Oh no!".
> >
> > But this series seems reasonable to me. I see that PeterZ had a couple
> > of minor comments (well, the arm64 one is more fundamental), which
> > hopefully means that it seems reasonable to him too. Peter?
>
> I've not had time to fully read the whole thing yet, I only did a quick
> once over. I'll try and get around to doing a proper reading eventually,
> but I'm chasing a regression atm, and then I need to go review a ton of
> code Andrew merged over the xmas/newyears holiday :/
>
> One potential issue is that qspinlock isn't suitable for all
> architectures -- and I've yet to figure out widely BPF is planning on
> using this.
For architectures where qspinlock is not available, I think we can
have a fallback to a test and set lock with timeout and deadlock
checks, like patch 12.
We plan on using this in BPF core and BPF maps, so the usage will be
pervasive, and we have atleast one architecture in CI (s390) which
doesn't have ARCH_USER_QUEUED_SPINLOCK selected, so we should have
coverage for both cases. For now the fallback is missing, but I will
add one in v2.
> Notably qspinlock is ineffective (as in way over engineered)
> for architectures that do not provide hardware level progress guarantees
> on competing atomics and qspinlock uses mixed sized atomics, which are
> typically under specified, architecturally.
Yes, we also noticed during development that try_cmpxchg_tail (in
patch 9) couldn't rely on 16-bit cmpxchg being available everywhere (I
think the build broke on arm64), unlike 16-bit xchg which is used in
xchg_tail, but otherwise we should be using 32-bit atomics or relying
on mixed sized atomics similar to qspinlock.
>
> Another issue is the code duplication.
I agree that this isn't ideal, but IMO it would be too ugly to ifdef
parts of qspinlock slow path to accommodate rqspinlock logic, and it
will get harder to reason about. Plus there's distinct return types
for both slow paths, which means if we combine them we end up with the
normal qspinlock returning a value, which isn't very meaningful. We
can probably discuss more code sharing possibilities through common
inline functions to minimize duplication though.
>
> Anyway, I'll get to it eventually...