Re: [PATCH bpf v2] bpf: Run generic devmap egress prog on private skb

From: Toke Høiland-Jørgensen

Date: Wed Jun 10 2026 - 07:03:05 EST


Sun Jian <sun.jian.kdev@xxxxxxxxx> writes:

> Generic XDP devmap multi redirect uses skb_clone() for the
> intermediate destinations and sends the last destination with the
> original skb. This can leave multiple destinations sharing the same
> packet data.
>
> This becomes visible when a devmap egress program mutates packet data.
> One destination can observe changes made for another destination. The
> last-destination path has the same problem: the last destination runs on
> the original skb, so its egress program can modify packet data still
> shared with earlier cloned skbs.
>
> Native XDP broadcast redirect does not have this issue because
> xdpf_clone() copies the frame data for each destination. Generic XDP
> should provide the same per-destination isolation before running a
> devmap egress program.
>
> Fix this by making cloned skbs private in dev_map_generic_redirect()
> before running the devmap egress program. Use skb_copy() instead of
> skb_unshare() so that allocation failure does not consume the skb and
> the existing caller error paths keep their ownership semantics.
>
> Add a selftest that covers the last-destination case where earlier
> destinations do not have a devmap egress program, while the final
> destination does.
>
> Tested with:
> ./test_progs -t xdp_veth_egress
> ./test_progs -t xdp_veth
> ./test_progs -t xdp
>
> Fixes: e624d4ed4aa8 ("xdp: Extend xdp_redirect_map with broadcast support")
> Suggested-by: Jiayuan Chen <jiayuan.chen@xxxxxxxxx>
> Signed-off-by: Sun Jian <sun.jian.kdev@xxxxxxxxx>

With a few nits (see below):

Reviewed-by: Toke Høiland-Jørgensen <toke@xxxxxxxxxx>

> ---
> v1: https://lore.kernel.org/bpf/CABFUUZFimdrZdq=NWi+N-0sJZWvMwY=f4iF6-3TVMS8=m07Zmw@xxxxxxxxxxxxxx/
>
> Changes in v2:
> - Move the private-copy step into dev_map_generic_redirect() so the
> last-destination path is covered as well.
> - Use skb_copy() instead of skb_unshare() to keep caller ownership
> unchanged on allocation failure.
> - Add a generic XDP last-destination selftest case.
>
> kernel/bpf/devmap.c | 10 ++
> .../selftests/bpf/prog_tests/test_xdp_veth.c | 151 +++++++++++++++++-
> 2 files changed, 158 insertions(+), 3 deletions(-)
>
> diff --git a/kernel/bpf/devmap.c b/kernel/bpf/devmap.c
> index cc0a43ebab6b..59f267685bc6 100644
> --- a/kernel/bpf/devmap.c
> +++ b/kernel/bpf/devmap.c
> @@ -700,12 +700,22 @@ int dev_map_enqueue_multi(struct xdp_frame *xdpf, struct net_device *dev_rx,
> int dev_map_generic_redirect(struct bpf_dtab_netdev *dst, struct sk_buff *skb,
> const struct bpf_prog *xdp_prog)
> {
> + struct sk_buff *nskb;

nit: this definition could go inside the if statement block below, to
make it obvious that nskb is not used outside that branch.

> int err;
>
> err = xdp_ok_fwd_dev(dst->dev, skb->len);
> if (unlikely(err))
> return err;
>
> + if (dst->xdp_prog && skb_cloned(skb)) {
> + nskb = skb_copy(skb, GFP_ATOMIC);
> + if (!nskb)
> + return -ENOMEM;
> +
> + consume_skb(skb);
> + skb = nskb;
> + }
> +
> /* Redirect has already succeeded semantically at this point, so we just
> * return 0 even if packet is dropped. Helper below takes care of
> * freeing skb.
> diff --git a/tools/testing/selftests/bpf/prog_tests/test_xdp_veth.c b/tools/testing/selftests/bpf/prog_tests/test_xdp_veth.c
> index 3e98a1665936..1f0b9ade12fe 100644
> --- a/tools/testing/selftests/bpf/prog_tests/test_xdp_veth.c
> +++ b/tools/testing/selftests/bpf/prog_tests/test_xdp_veth.c

It's customary to split changes to the selftests into their own commits.

-Toke