Re: [PATCH v2 bpf-next 3/5] ftrace: introduce FTRACE_OPS_FL_SHARE_IPMODIFY

From: Steven Rostedt
Date: Fri Jul 15 2022 - 16:00:06 EST


On Fri, 15 Jul 2022 19:49:00 +0000
Song Liu <songliubraving@xxxxxx> wrote:

> >
> > What about if we release the lock when doing the callback?
>
> We can probably unlock ftrace_lock here. But we may break locking order
> with direct mutex (see below).

You're talking about the multi registering case, right?

>
> >
> > Then we just need to make sure things are the same after reacquiring the
> > lock, and if they are different, we release the lock again and do the
> > callback with the new update. Wash, rinse, repeat, until the state is the
> > same before and after the callback with locks acquired?
>
> Personally, I would like to avoid wash-rinse-repeat here.

But it's common to do. Keeps your hair cleaner that way ;-)

>
> >
> > This is a common way to handle callbacks that need to do something that
> > takes the lock held before doing a callback.
> >
> > The reason I say this, is because the more we can keep the accounting
> > inside of ftrace the better.
> >
> > Wouldn't this need to be done anyway if BPF was first and live kernel
> > patching needed the update? An -EAGAIN would not suffice.
>
> prepare_direct_functions_for_ipmodify handles BPF-first-livepatch-later
> case. The benefit of prepare_direct_functions_for_ipmodify() is that it
> holds direct_mutex before ftrace_lock, and keeps holding it if necessary.
> This is enough to make sure we don't need the wash-rinse-repeat.
>
> OTOH, if we wait until __ftrace_hash_update_ipmodify(), we already hold
> ftrace_lock, but not direct_mutex. To make changes to bpf trampoline, we
> have to unlock ftrace_lock and lock direct_mutex to avoid deadlock.
> However, this means we will need the wash-rinse-repeat.
>
>
> For livepatch-first-BPF-later case, we can probably handle this in
> __ftrace_hash_update_ipmodify(), since we hold both direct_mutex and
> ftrace_lock. We can unlock ftrace_lock and update the BPF trampoline.
> It is safe against changes to direct ops, because we are still holding
> direct_mutex. But, is this safe against another IPMODIFY ops? I am not
> sure yet... Also, this is pretty weird because, we are updating a
> direct trampoline before we finish registering it for the first time.
> IOW, we are calling modify_ftrace_direct_multi_nolock for the same
> trampoline before register_ftrace_direct_multi() returns.
>
> The approach in v2 propagates the -EAGAIN to BPF side, so these are two
> independent calls of register_ftrace_direct_multi(). This does require
> some protocol between ftrace core and its user, but I still think this
> is a cleaner approach.

The issue I have with this approach is it couples BPF and ftrace a bit too
much.

But there is a way with my approach you can still do your approach. That
is, have ops_func() return zero if everything is fine, and otherwise returns
a negative value. Then have the register function fail and return whatever
value that gets returned by the ops_func()

Then have the bpf ops_func() check (does this direct caller handle
IPMODIFY? if yes, return 0, else return -EAGAIN). Then the registering of
ftrace fails with your -EAGAIN, and then you can change the direct
trampoline to handle IPMODIFY and try again. This time when ops_func() is
called, it sees that the direct trampoline can handle the IPMODIFY and
returns 0.

Basically, it's a way to still implement my suggestion, but let BPF decide
to use -EAGAIN to try again. And then BPF and ftrace don't need to have
these special flags to change the behavior of each other.

-- Steve