Re: [bug] kernel: bpf: syscall: a possible sleep-in-atomic bug in __bpf_prog_put()

From: Yonghong Song
Date: Thu May 25 2023 - 01:38:02 EST




On 5/24/23 12:44 PM, Alexei Starovoitov wrote:
On Wed, May 24, 2023 at 12:34 PM Yonghong Song <yhs@xxxxxxxx> wrote:



On 5/24/23 5:42 AM, Teng Qi wrote:
Thank you.

We cannot use rcu_read_lock_held() in the 'if' statement. The return
value rcu_read_lock_held() could be 1 for some configurations regardless
whether rcu_read_lock() is really held or not. In most cases,
rcu_read_lock_held() is used in issuing potential warnings.
Maybe there are other ways to record whether rcu_read_lock() is held or not?

Sorry. I was not aware of the dependency of configurations of
rcu_read_lock_held().

If we cannot resolve rcu_read_lock() presence issue, maybe the condition
can be !in_interrupt(), so any process-context will go to a workqueue.

I agree that using !in_interrupt() as a condition is an acceptable solution.

This should work although it could be conservative.


Alternatively, we could have another solution. We could add another
function e.g., bpf_prog_put_rcu(), which indicates that bpf_prog_put()
will be done in rcu context.

Implementing a new function like bpf_prog_put_rcu() is a solution that involves
more significant changes.

Maybe we can change signature of bpf_prog_put instead? Like
void bpf_prog_put(struct bpf_prog *prog, bool in_rcu)
and inside bpf_prog_put we can add
WARN_ON_ONCE(in_rcu && !bpf_rcu_lock_held());

bpf_rcu_lock_held() is used for different cases.

Sorry, I actually mean rcu_read_lock_held() ...


Here s/in_irq/in_interrupt/ inside bpf_prog_put() is enough
to address this theoretical issue.

Maybe

if (!in_interrupt()) {
INIT_WORK(&aux->work, bpf_prog_put_deferred);
schedule_work(&aux->work);
} else {
bpf_prog_put_deferred(&aux->work);
}
?

Basically for any process context, use a work queue since
we have no idea whether rcu_read_lock() is held or not.
In process context, is_atmoc() and irqs_disabled() should
already use the work queue.
As we discussed in the above, if in_interrupt() is true,
kvfree seems okay, so can directly call
bpf_prog_put_deferred().
Does this sound reasonable?