Re: [PATCH net v1] bonding: fix type confusion in bond_setup_by_slave()
From: Jiayuan Chen
Date: Wed Mar 04 2026 - 05:16:14 EST
March 4, 2026 at 13:23, "Jay Vosburgh" <jv@xxxxxxxxxxxxx mailto:jv@xxxxxxxxxxxxx?to=%22Jay%20Vosburgh%22%20%3Cjv%40jvosburgh.net%3E > wrote:
>
> Jiayuan Chen <jiayuan.chen@xxxxxxxxx> wrote:
>
> >
> > From: Jiayuan Chen <jiayuan.chen@xxxxxxxxxx>
> >
[...]
> > bond_dev->header_ops = slave_dev->header_ops;
Hi Jay,
Thank you for reviewing this patch.
> This is true only for devices that are not ARPHRD_ETHER
> >
> > This causes a type confusion when dev_hard_header() is later called
> > on the bond device. Functions like ipgre_header(), ip6gre_header(),
> > vlan_dev_hard_header(), and macvlan_hard_header() all use
> > netdev_priv(dev) to access their device-specific private data. When
> > called with the bond device, netdev_priv() returns the bond's private
> > data (struct bonding) instead of the expected type (e.g. struct
> > ip_tunnel), leading to garbage values being read and kernel crashes.
> >
> I believe that vlan and macvlan are both ARPHRD_ETHER, and thus
> won't take the bond_setup_by_slave path (which was originally
> implemented for Infiniband).
> That said, your description for the GRE paths seems correct.
Thanks. You are right. vlan and macvlan are both ARPHRD_ETHER and go through
bond_ether_setup(), not bond_setup_by_slave(). I'll remove them from
the description in v2.
[...]
> > ip link add dummy0 type dummy
> > ip addr add 10.0.0.1/24 dev dummy0
> > ip link set dummy0 up
> > ip link add gre1 type gre local 10.0.0.1
> > ip link add bond1 type bond mode active-backup
> > ip link set gre1 master bond1
> > ip link set gre1 up
> > ip link set bond1 up
> > ip addr add fe80::1/64 dev bond1
> >
> > Fixes: c54419321455 ("GRE: Refactor GRE tunneling code.")
> >
> Has something relevant changed more recently than this? I seem
> to recall hearing of folks using bonding with GRE more recently than the
> 2013 date on c54419321455. I didn't do an exhaustive search, but I feel
> like someone would have run into this in the prior 13-ish years if it's
> been broken the whole time.
Good question. Here is the full timeline:
1. 1da177e4c3f4 ("Linux-2.6.12-rc2", 2005) - ipgre_header() already
uses dev->priv (later netdev_priv(dev)), but bonding cannot
enslave non-Ethernet devices at all, so no issue.
2. 872254dd6b1f ("net/bonding: Enable bonding to enslave non
ARPHRD_ETHER", 2007-10-09) - bond_setup_by_slave() is introduced
for IPoIB support, but it does NOT copy hard_header or header_ops.
Still no issue.
3. 3b04ddde02cf ("[NET]: Move hardware header operations out of
netdevice.", 2007-10-09) - dev->hard_header is refactored into
dev->header_ops->create. GRE gets ipgre_header_ops. Still no issue
because bond doesn't copy header_ops yet.
4. 1284cd3a2b74 ("bonding: two small fixes for IPoIB support",
2007-10-15) - bond_setup_by_slave() starts copying header_ops from
slave to bond. This is the commit that introduces the flawed pattern.
The problem is not limited to the panic shown in the crash
log. The fundamental issue is that netdev_priv(bond_dev) returns a
pointer to struct bonding, which ipgre_header() interprets as struct
ip_tunnel. Fields like t->hlen and t->parms.iph contain whatever
garbage happens to be at those offsets in the bonding structure.
When t->hlen is abnormally large or negative, it triggers the BUG_ON in
pskb_expand_head(). But even without a panic, the corrupted values
would produce malformed packets with wrong IP headers and GRE flags.
So now I think 1284cd3a2b74 is more appropriate.
I also tested on kernel 5.10 and added printk to show dev->name, it was
still "bond1".
~# dmesg -c
[ 0.000000] Linux version 5.10.248+ (root@xxx) (gcc (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0, GNU ld (GNU Binutils for Ubuntu) 2.42) #4 SMP PREEMPT Wed Mar 46
[ 0.000000] Command line: console=ttyS0 root=/dev/sda earlyprintk=serial net.ifnames=0 rw tmpfs=20G kmsan.panic=0
[ 0.000000] KERNEL supported cpus:
......
[ 44.942850] bond1: (slave gre1): The slave device specified does not support setting the MAC address
[ 44.942859] bond1: (slave gre1): Setting fail_over_mac to active for active-backup mode
[ 44.944602] bond1: (slave gre1): making interface the new active one
[ 44.944697] bond1: (slave gre1): Enslaving as an active interface with an up link
[ 44.951456] 8021q: adding VLAN 0 to HW filter on device bond1
[ 44.969646] dev name bond1
[ 45.333625] dev name bond1
[ 45.941654] IPv6: ADDRCONF(NETDEV_CHANGE): bond1: link becomes ready
[ 45.953612] dev name bond1
[ 46.357629] dev name bond1
[ 46.549627] dev name bond1
[ 46.581614] dev name bond1
[ 47.381737] dev name bond1
[ 47.381749] dev name bond1
[ 47.489620] dev name bond1
[...]
> > +
> > +static const struct header_ops bond_header_ops = {
> > + .create = bond_header_create,
> > + .parse = bond_header_parse,
> > +};
> > +
> > static void bond_setup_by_slave(struct net_device *bond_dev,
> > struct net_device *slave_dev)
> > {
> > @@ -1516,7 +1554,8 @@ static void bond_setup_by_slave(struct net_device *bond_dev,
> >
> > dev_close(bond_dev);
> >
> > - bond_dev->header_ops = slave_dev->header_ops;
> > + bond_dev->header_ops = slave_dev->header_ops ?
> > + &bond_header_ops : NULL;
> >
> Can slave_dev->header_ops actually be NULL?
Yes. For example, a GRE tunnel created with a unicast remote
ip link add gre1 type gre local 10.0.0.1 remote 10.0.0.2
does not set header_ops (see ipgre_tunnel_init(), the iph->daddr && !tunnel->collect_md branch).
> -J
>
> >
> > bond_dev->type = slave_dev->type;
> > bond_dev->hard_header_len = slave_dev->hard_header_len;
> > --
> > 2.43.0
> >
> ---
> -Jay Vosburgh, jv@xxxxxxxxxxxxx
>