Re: [PATCH bpf-next v1 12/22] rqspinlock: Add basic support for CONFIG_PARAVIRT
From: Alexei Starovoitov
Date: Wed Jan 08 2025 - 22:37:41 EST
On Wed, Jan 8, 2025 at 6:58 PM Waiman Long <llong@xxxxxxxxxx> wrote:
>
>
> On 1/8/25 9:42 PM, Alexei Starovoitov wrote:
> > On Wed, Jan 8, 2025 at 4:48 PM Waiman Long <llong@xxxxxxxxxx> wrote:
> >> Is the intention to only replace raw_spinlock_t by rqspinlock but never
> >> spinlock_t?
> > Correct. We brainstormed whether we can introduce resilient mutex
> > for sleepable context, but it's way out of scope and PI
> > considerations are too complex to think through.
> > rqspinlock is a spinning lock, so it's a replacement for raw_spin_lock
> > and really only for bpf use cases.
> Thank for the confirmation. I think we should document the fact that
> rqspinlock is a replacement for raw_spin_lock only in the rqspinlock.c
> file to prevent possible abuse in the future.
Agreed.
> >
> > We considered placing rqspinlock.c in kernel/bpf/ directory
> > to discourage any other use beyond bpf,
> > but decided to keep in kernel/locking/ only because
> > it's using mcs_spinlock.h and qspinlock_stat.h
> > and doing #include "../locking/mcs_spinlock.h"
> > is kinda ugly.
> >
> > Patch 16 does:
> > +++ b/kernel/locking/Makefile
> > @@ -24,6 +24,9 @@ obj-$(CONFIG_SMP) += spinlock.o
> > obj-$(CONFIG_LOCK_SPIN_ON_OWNER) += osq_lock.o
> > obj-$(CONFIG_PROVE_LOCKING) += spinlock.o
> > obj-$(CONFIG_QUEUED_SPINLOCKS) += qspinlock.o
> > +ifeq ($(CONFIG_BPF_SYSCALL),y)
> > +obj-$(CONFIG_QUEUED_SPINLOCKS) += rqspinlock.o
> > +endif
> >
> > so that should give enough of a hint that it's for bpf usage.
> >
> >> As for the locking semantics allowed by the BPF verifier, is it possible
> >> to enforce the strict locking rules for PREEMPT_RT kernel and use the
> >> relaxed semantics for non-PREEMPT_RT kernel. We don't want the loading
> >> of an arbitrary BPF program to break the latency guarantee of a
> >> PREEMPT_RT kernel.
> > Not really.
> > root can load silly bpf progs that take significant
> > amount time without abusing spinlocks.
> > Like 100k integer divides or a sequence of thousands of calls to map_update.
> > Long runtime of broken progs is a known issue.
> > We're working on a runtime termination check/watchdog that
> > will detect long running progs and will terminate them.
> > Safe termination is tricky, as you can imagine.
>
> Right.
>
> In that case, we just have to warn users that they can load BPF prog at
> their own risk and PREEMPT_RT kernel may break its latency guarantee.
Let's not open this can of worms.
There will be a proper watchdog eventually.
If we start to warn, when do we warn? On any bpf program loaded?
How about classic BPF ? tcpdump and seccomp ? They are limited
to 4k instructions, but folks can abuse that too.