Re: [PATCH net v4 1/2] bonding: fix null-ptr-deref in bond_rr_gen_slave_id()
From: Jiayuan Chen
Date: Wed Mar 04 2026 - 03:48:14 EST
March 4, 2026 at 16:20, "Daniel Borkmann" <daniel@xxxxxxxxxxxxx mailto:daniel@xxxxxxxxxxxxx?to=%22Daniel%20Borkmann%22%20%3Cdaniel%40iogearbox.net%3E > wrote:
>
> On 3/4/26 8:42 AM, Jiayuan Chen wrote:
>
> >
> > From: Jiayuan Chen <jiayuan.chen@xxxxxxxxxx>
> > bond_rr_gen_slave_id() dereferences bond->rr_tx_counter without a NULL
> > check. rr_tx_counter is a per-CPU counter only allocated in bond_open()
> > when the bond mode is round-robin. If the bond device was never brought
> > up, rr_tx_counter remains NULL, causing a null-ptr-deref.
> > The XDP redirect path can reach this code even when the bond is not up:
> > bpf_master_redirect_enabled_key is a global static key, so when any bond
> > device has native XDP attached, the XDP_TX -> xdp_master_redirect()
> > interception is enabled for all bond slaves system-wide. This allows the
> > path xdp_master_redirect() -> bond_xdp_get_xmit_slave() ->
> > bond_xdp_xmit_roundrobin_slave_get() -> bond_rr_gen_slave_id() to be
> > reached on a bond that was never opened.
> > Fix this by allocating rr_tx_counter unconditionally in bond_init()
> > (ndo_init), which is called by register_netdevice() and covers both
> > device creation paths (bond_create() and bond_newlink()). This also
> > handles the case where bond mode is changed to round-robin after device
> > creation. The conditional allocation in bond_open() is removed. Since
> > bond_destructor() already unconditionally calls
> > free_percpu(bond->rr_tx_counter), the lifecycle is clean: allocate at
> > ndo_init, free at destructor.
> > Note: rr_tx_counter is only used by round-robin mode, so this
> > deliberately allocates a per-cpu u32 that goes unused for other modes.
> > Conditional allocation (e.g., in bond_option_mode_set) was considered
> > but rejected: the XDP path can race with mode changes on a downed bond,
> > and adding memory barriers to the XDP hot path is not justified for
> > saving 4 bytes per CPU.
> >
> Arguably it's a corner case, but could we not just do sth like this to
> actually check if the device is up and if not drop?
>
> diff --git a/net/core/filter.c b/net/core/filter.c
> index ba019ded773d..c447fd989a27 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -4387,6 +4387,9 @@ u32 xdp_master_redirect(struct xdp_buff *xdp)
> struct net_device *master, *slave;
> master = netdev_master_upper_dev_get_rcu(xdp->rxq->dev);
> + if (unlikely(!(master->flags & IFF_UP)))
> + return XDP_ABORTED;
> +
> slave = master->netdev_ops->ndo_xdp_get_xmit_slave(master, xdp);
> if (slave && slave != xdp->rxq->dev) {
> /* The target device is different from the receiving device, so
>
Hi Daniel,
It was discussed at [1].
The primary concern at the time was that assuming
bond->rr_tx_counter had been allocated based on the
interface being up was too implicit.
[1] https://lore.kernel.org/netdev/20260224112545.37888-1-jiayuan.chen@xxxxxxxxx/T/#t
Feel free to share any other thoughts or ideas
Thanks,