Re: [PATCH v2] kernel: bpf: syscall: fix a possible sleep-in-atomic bug in __bpf_prog_put()

From: Yonghong Song
Date: Tue May 30 2023 - 13:46:34 EST

On 5/30/23 12:06 AM, starmiku1207184332@xxxxxxxxx wrote:
From: Teng Qi <starmiku1207184332@xxxxxxxxx>

__bpf_prog_put() indirectly calls kvfree() through bpf_prog_put_deferred(),
which is unsafe in atomic context. The current condition
'in_irq() || irqs_disabled()' in __bpf_prog_put(), meant to ensure safety,
does not cover spin lock regions or RCU read lock regions.
Since __bpf_prog_put() is called by various callers in kernel/, net/ and
drivers/, and potentially more in the future, it is necessary to handle those
cases as well.

Although we haven't found a proper way to identify the RCU read lock region,
we have noticed that vfree() calls vfree_atomic() under the
condition 'in_interrupt()' to ensure safety.
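
To make the difference between the two conditions concrete, here is a
minimal userspace sketch, not kernel code. It assumes the preempt_count
bit layout from include/linux/preempt.h (8 preempt, 8 softirq, 4 hardirq
and 4 NMI bits) and models the IRQ flag as a plain boolean. All names
(struct model_ctx, the ctx_*() helpers, old_check(), new_check()) are
hypothetical and exist only for this illustration. RCU read-side sections
are deliberately left out, because whether they touch preempt_count
depends on the kernel configuration.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed bit layout, modeled on include/linux/preempt.h. */
#define PREEMPT_BITS	8
#define SOFTIRQ_BITS	8
#define HARDIRQ_BITS	4
#define NMI_BITS	4

#define PREEMPT_SHIFT	0
#define SOFTIRQ_SHIFT	(PREEMPT_SHIFT + PREEMPT_BITS)
#define HARDIRQ_SHIFT	(SOFTIRQ_SHIFT + SOFTIRQ_BITS)
#define NMI_SHIFT	(HARDIRQ_SHIFT + HARDIRQ_BITS)

#define MODEL_MASK(bits, shift)	(((1u << (bits)) - 1u) << (shift))
#define PREEMPT_MASK	MODEL_MASK(PREEMPT_BITS, PREEMPT_SHIFT)
#define SOFTIRQ_MASK	MODEL_MASK(SOFTIRQ_BITS, SOFTIRQ_SHIFT)
#define HARDIRQ_MASK	MODEL_MASK(HARDIRQ_BITS, HARDIRQ_SHIFT)
#define NMI_MASK	MODEL_MASK(NMI_BITS, NMI_SHIFT)

#define PREEMPT_OFFSET	(1u << PREEMPT_SHIFT)
#define SOFTIRQ_OFFSET	(1u << SOFTIRQ_SHIFT)
#define HARDIRQ_OFFSET	(1u << HARDIRQ_SHIFT)

static_assert((PREEMPT_MASK & SOFTIRQ_MASK) == 0 &&
	      (SOFTIRQ_MASK & HARDIRQ_MASK) == 0 &&
	      (HARDIRQ_MASK & NMI_MASK) == 0,
	      "modeled preempt_count fields must not overlap");

/* Hypothetical stand-in for per-CPU state; not a kernel structure. */
struct model_ctx {
	unsigned int preempt_count;
	bool irqs_off;		/* models what irqs_disabled() reports */
};

/* Old condition in __bpf_prog_put(): in_irq() || irqs_disabled(). */
static inline bool old_check(struct model_ctx c)
{
	return (c.preempt_count & HARDIRQ_MASK) != 0 || c.irqs_off;
}

/* Condition used by the patch: in_interrupt(). */
static inline bool new_check(struct model_ctx c)
{
	return (c.preempt_count &
		(NMI_MASK | HARDIRQ_MASK | SOFTIRQ_MASK)) != 0;
}

/* Plain task context: nothing held, IRQs on. */
static inline struct model_ctx ctx_task(void)
{
	return (struct model_ctx){ 0, false };
}

/* Inside spin_lock(): preemption count raised, IRQs still on. */
static inline struct model_ctx ctx_spin_lock(void)
{
	return (struct model_ctx){ PREEMPT_OFFSET, false };
}

/* Inside spin_lock_irqsave(): preemption count raised, IRQs off. */
static inline struct model_ctx ctx_spin_lock_irqsave(void)
{
	return (struct model_ctx){ PREEMPT_OFFSET, true };
}

/* Softirq handler: softirq count raised, IRQs on. */
static inline struct model_ctx ctx_softirq(void)
{
	return (struct model_ctx){ SOFTIRQ_OFFSET, false };
}

/* Hardirq handler: hardirq count raised, IRQs off. */
static inline struct model_ctx ctx_hardirq(void)
{
	return (struct model_ctx){ HARDIRQ_OFFSET, true };
}
```

Under these assumptions, a plain spin_lock() region is seen by neither
condition, a spin_lock_irqsave() region is seen only by the old condition,
and a softirq handler is seen only by in_interrupt(). The sketch says
nothing about which branch __bpf_prog_put() should take in each case.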

I would really like you to create a test case that demonstrates
RCU or spin-lock warnings against the existing code base.
Without a test case, it is hard to see whether we need this
patch or not.


To make __bpf_prog_put() safe in practice, we propose calling
bpf_prog_put_deferred() with the condition 'in_interrupt()' and
using the work queue for any other context.

We also added a comment to indicate that the safety of __bpf_prog_put()
relies implicitly on the implementation of vfree().

Signed-off-by: Teng Qi <starmiku1207184332@xxxxxxxxx>
---
v2:
Remove comments, since the code is self-explanatory.

Fixes: d809e134be7a ("bpf: Prepare bpf_prog_put() to be called from irq context.")

Please put 'Fixes' right before 'Signed-off-by' in the above.

---
kernel/bpf/syscall.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 14f39c1e573e..96658e5874be 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -2099,7 +2099,7 @@ static void __bpf_prog_put(struct bpf_prog *prog)
 	struct bpf_prog_aux *aux = prog->aux;
 	if (atomic64_dec_and_test(&aux->refcnt)) {
-		if (in_irq() || irqs_disabled()) {
+		if (!in_interrupt()) {

Could we have cases where in software context we have irqs_disabled()?

 			INIT_WORK(&aux->work, bpf_prog_put_deferred);
 			schedule_work(&aux->work);
 		} else {