Re: [PATCH bpf-next v1 12/22] rqspinlock: Add basic support for CONFIG_PARAVIRT

From: Waiman Long
Date: Wed Jan 08 2025 - 22:46:20 EST


On 1/8/25 10:37 PM, Alexei Starovoitov wrote:
On Wed, Jan 8, 2025 at 6:58 PM Waiman Long <llong@xxxxxxxxxx> wrote:

On 1/8/25 9:42 PM, Alexei Starovoitov wrote:
On Wed, Jan 8, 2025 at 4:48 PM Waiman Long <llong@xxxxxxxxxx> wrote:
Is the intention to only replace raw_spinlock_t by rqspinlock but never
spinlock_t?
Correct. We brainstormed whether we can introduce resilient mutex
for sleepable context, but it's way out of scope and PI
considerations are too complex to think through.
rqspinlock is a spinning lock, so it's a replacement for raw_spin_lock
and really only for bpf use cases.
Thank for the confirmation. I think we should document the fact that
rqspinlock is a replacement for raw_spin_lock only in the rqspinlock.c
file to prevent possible abuse in the future.
Agreed.

We considered placing rqspinlock.c in kernel/bpf/ directory
to discourage any other use beyond bpf,
but decided to keep in kernel/locking/ only because
it's using mcs_spinlock.h and qspinlock_stat.h
and doing #include "../locking/mcs_spinlock.h"
is kinda ugly.

Patch 16 does:
+++ b/kernel/locking/Makefile
@@ -24,6 +24,9 @@ obj-$(CONFIG_SMP) += spinlock.o
obj-$(CONFIG_LOCK_SPIN_ON_OWNER) += osq_lock.o
obj-$(CONFIG_PROVE_LOCKING) += spinlock.o
obj-$(CONFIG_QUEUED_SPINLOCKS) += qspinlock.o
+ifeq ($(CONFIG_BPF_SYSCALL),y)
+obj-$(CONFIG_QUEUED_SPINLOCKS) += rqspinlock.o
+endif

so that should give enough of a hint that it's for bpf usage.

As for the locking semantics allowed by the BPF verifier, is it possible
to enforce the strict locking rules for PREEMPT_RT kernel and use the
relaxed semantics for non-PREEMPT_RT kernel. We don't want the loading
of an arbitrary BPF program to break the latency guarantee of a
PREEMPT_RT kernel.
Not really.
root can load silly bpf progs that take significant
amount time without abusing spinlocks.
Like 100k integer divides or a sequence of thousands of calls to map_update.
Long runtime of broken progs is a known issue.
We're working on a runtime termination check/watchdog that
will detect long running progs and will terminate them.
Safe termination is tricky, as you can imagine.
Right.

In that case, we just have to warn users that they can load BPF prog at
their own risk and PREEMPT_RT kernel may break its latency guarantee.
Let's not open this can of worms.
There will be a proper watchdog eventually.
If we start to warn, when do we warn? On any bpf program loaded?
How about classic BPF ? tcpdump and seccomp ? They are limited
to 4k instructions, but folks can abuse that too.

My intention is to document this somewhere, not to print out a warning in the kernel dmesg log.

Cheers,
Longman