Re: [RFC PATCH 0/4] trace, livepatch: Allow kprobe return overriding for livepatched functions

From: Yafang Shao

Date: Mon Apr 06 2026 - 06:56:09 EST

On Sat, Apr 4, 2026 at 12:07 AM Song Liu <song@xxxxxxxxxx> wrote:
>
> Hi Yafang,
>
> On Thu, Apr 2, 2026 at 2:26 AM Yafang Shao <laoar.shao@xxxxxxxxx> wrote:
> >
> > Livepatching allows for rapid experimentation with new kernel features
> > without interrupting production workloads. However, static livepatches lack
> > the flexibility required to tune features based on task-specific attributes,
> > such as cgroup membership, which is critical in multi-tenant k8s
> > environments. Furthermore, hardcoding logic into a livepatch prevents
> > dynamic adjustments based on the runtime environment.
> >
> > To address this, we propose a hybrid approach using BPF. Our production use
> > case involves:
> >
> > 1. Deploying a Livepatch function to serve as a stable BPF hook.
> >
> > 2. Utilizing bpf_override_return() to dynamically modify the return value
> > of that hook based on the current task's context.
>
> Could you please provide a specific use case that can benefit from this?
> AFAICT, livepatch is more flexible but risky (may cause crash); while
> BPF is safe, but less flexible. The combination you are proposing seems
> to get the worse of the two sides. Maybe it can indeed get the benefit of
> both sides in some cases, but I cannot think of such examples.
>

Here is an example we recently deployed on our production servers:

https://lore.kernel.org/bpf/CALOAHbDnNba_w_nWH3-S9GAXw0+VKuLTh1gy5hy9Yqgeo4C0iA@xxxxxxxxxxxxxx/

In one of our specific clusters, we needed to send BGP traffic out
through specific NICs based on the destination IP. To achieve this
without interrupting service, we live-patched
bond_xmit_3ad_xor_slave_get(), added a new hook called
bond_get_slave_hook(), and then ran a BPF program attached to that
hook to select the outgoing NIC from the SKB. This allowed us to
rapidly deploy the feature with zero downtime.
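
To make the flow above concrete, here is a rough user-space model of the
kind of decision such a BPF program makes before overriding the hook's
return value. It is an illustrative sketch only, not the code we run: the
rule table, the names route_rule and select_slave, the longest-prefix
policy, and the -1 "leave the default hash alone" convention are all
assumptions made for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical value meaning "do not override, keep the default choice". */
#define HOOK_NO_OVERRIDE (-1)

/* Hypothetical rule: destinations inside prefix/len use slave_idx. */
struct route_rule {
    uint32_t prefix;    /* IPv4 prefix, host byte order */
    uint8_t  len;       /* prefix length, 0..32 */
    int      slave_idx; /* index of the bond slave (NIC) to use */
};

/* Build a netmask from a prefix length. */
uint32_t prefix_mask(uint8_t len)
{
    if (len == 0)
        return 0;               /* shifting by 32 would be undefined */
    if (len > 32)
        len = 32;               /* clamp malformed rules */
    return UINT32_MAX << (32 - len);
}

/*
 * Longest-prefix match of daddr against the rule table.
 * Returns the slave index of the most specific matching rule,
 * or HOOK_NO_OVERRIDE when no rule matches.
 */
int select_slave(const struct route_rule *rules, size_t n, uint32_t daddr)
{
    int best = HOOK_NO_OVERRIDE;
    int best_len = -1;

    for (size_t i = 0; i < n; i++) {
        uint32_t mask = prefix_mask(rules[i].len);

        if ((daddr & mask) == (rules[i].prefix & mask) &&
            rules[i].len > best_len) {
            best = rules[i].slave_idx;
            best_len = rules[i].len;
        }
    }
    return best;
}
```

In the real setup the equivalent logic would live in the BPF program
attached to bond_get_slave_hook(), with the result handed to
bpf_override_return(); in this model the rules would more naturally be kept
in a BPF map so they can be changed at runtime without touching the
livepatch.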

--
Regards
Yafang