[PATCH RT 09/12] bpf: Remove in_atomic() from bpf_link_put().

From: Joseph Salisbury
Date: Wed Oct 18 2023 - 15:49:30 EST


From: Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx>

v5.15.133-rt70-rc1 stable review patch.
If anyone has any objections, please let me know.

-----------


bpf_free_inode() is invoked as a RCU callback. Usually RCU callbacks are
invoked within softirq context. By setting rcutree.use_softirq=0 boot
option the RCU callbacks will be invoked in a per-CPU kthread with
bottom halves disabled which implies a RCU read section.

On PREEMPT_RT the context remains fully preemptible. The RCU read
section however does not allow schedule() invocation. The latter happens
in mutex_lock() performed by bpf_trampoline_unlink_prog() originated
from bpf_link_put().

It was pointed out that the bpf_link_put() invocation should not be
delayed if originated from close(). It was also pointed out that other
invocations from within a syscall should also avoid the workqueue.
Everyone else should use workqueue by default to remain safe in the
future (while auditing the code, every caller was preemptible except for
the RCU case).

Let bpf_link_put() use the worker unconditionally. Add
bpf_link_put_direct() which will directly free the resources and is used
by close() and from within __sys_bpf().

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx>
Signed-off-by: Andrii Nakryiko <andrii@xxxxxxxxxx>
Link: https://lore.kernel.org/bpf/20230614083430.oENawF8f@xxxxxxxxxxxxx
(cherry picked from commit ab5d47bd41b1db82c295b0e751e2b822b43a4b5a)
Signed-off-by: Joseph Salisbury <joseph.salisbury@xxxxxxxxxxxxx>
---
kernel/bpf/syscall.c | 29 ++++++++++++++++-------------
1 file changed, 16 insertions(+), 13 deletions(-)

diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index ad41b8230780..bcc01f9881cf 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -2454,27 +2454,30 @@ static void bpf_link_put_deferred(struct work_struct *work)
bpf_link_free(link);
}

-/* bpf_link_put can be called from atomic context, but ensures that resources
- * are freed from process context
+/* bpf_link_put might be called from atomic context. It needs to be called
+ * from sleepable context in order to acquire sleeping locks during the process.
*/
void bpf_link_put(struct bpf_link *link)
{
if (!atomic64_dec_and_test(&link->refcnt))
return;

- if (in_atomic()) {
- INIT_WORK(&link->work, bpf_link_put_deferred);
- schedule_work(&link->work);
- } else {
- bpf_link_free(link);
- }
+ INIT_WORK(&link->work, bpf_link_put_deferred);
+ schedule_work(&link->work);
+}
+
+static void bpf_link_put_direct(struct bpf_link *link)
+{
+ if (!atomic64_dec_and_test(&link->refcnt))
+ return;
+ bpf_link_free(link);
}

static int bpf_link_release(struct inode *inode, struct file *filp)
{
struct bpf_link *link = filp->private_data;

- bpf_link_put(link);
+ bpf_link_put_direct(link);
return 0;
}

@@ -4351,7 +4354,7 @@ static int link_update(union bpf_attr *attr)
if (ret)
bpf_prog_put(new_prog);
out_put_link:
- bpf_link_put(link);
+ bpf_link_put_direct(link);
return ret;
}

@@ -4374,7 +4377,7 @@ static int link_detach(union bpf_attr *attr)
else
ret = -EOPNOTSUPP;

- bpf_link_put(link);
+ bpf_link_put_direct(link);
return ret;
}

@@ -4425,7 +4428,7 @@ static int bpf_link_get_fd_by_id(const union bpf_attr *attr)

fd = bpf_link_new_fd(link);
if (fd < 0)
- bpf_link_put(link);
+ bpf_link_put_direct(link);

return fd;
}
@@ -4502,7 +4505,7 @@ static int bpf_iter_create(union bpf_attr *attr)
return PTR_ERR(link);

err = bpf_iter_new_fd(link);
- bpf_link_put(link);
+ bpf_link_put_direct(link);

return err;
}
--
2.34.1