Re: [PATCHv2 net] bonding: use permanent address for MAC swapping if device address is same

From: Jay Vosburgh
Date: Fri Apr 04 2025 - 17:36:55 EST


Hangbin Liu <liuhangbin@xxxxxxxxx> wrote:

>Similar with a951bc1e6ba5 ("bonding: correct the MAC address for "follow"
>fail_over_mac policy"). The fail_over_mac follow mode requires the formerly
>active slave to swap MAC addresses with the newly active slave during
>failover. However, the slave's MAC address can be same under certain
>conditions:
>
>1) ip link set eth0 master bond0
> bond0 adopts eth0's MAC address (MAC0).
>
>1) ip link set eth1 master bond0
> eth1 is added as a backup with its own MAC (MAC1).
>
>3) ip link set eth0 nomaster
> eth0 is released and restores its MAC (MAC0).
> eth1 becomes the active slave, and bond0 assigns MAC0 to eth1.

This step leaves both the bond+eth1 and the independent eth0
using the same MAC address. There is a warning printed for this, and
allowing the duplicated MAC address assignment has been the behavior for
a very long time, and to my knowledge hasn't caused issues (I presume
because swapping interfaces in and out of a bond willy nilly doesn't
happen much outside of test cases).

[ pasting in Paolo's comment from the other thread ]

Paolo says:
>It was not immediately clear to me that the mac-dance in the code below
>happens only at failover time.

"Failover" may be a misnomer here; the dance happens when the
active interface changes, which for this scenario occurs at either a
failover (link went down, etc) or when the active interface is removed
from the bond.

Paolo says:
>I second Jakub's doubt, I think it would be better to change eth0 mac
>address here (possibly to permanent eth1 mac, to preserve some consistency?)

This would cause the MAC of the bond itself to change (for the
interface removal case at issue here), which is not the current behavior
for fail_over_mac=follow. I'm not sure how often that's a dependency in
practice, but it would be a change of long-standing behavior.

Paolo says:
>Doing that in ndo_del_slave() should allow bonding to change the mac
>while still owning the old slave and avoid races with user-space.
>
>WDYT?

[ end of Paolo's comment ]

>4) ip link set eth0 master bond0
> eth0 is re-added to bond0, but both eth0 and eth1 now have MAC0,
> breaking the follow policy.
>
>To resolve this issue, we need to swap the new active slave’s permanent
>MAC address with the old one. The new active slave then uses the old
>dev_addr, ensuring that it matches the bond address. After the fix:

Which interface is the "new active" in this situation? Adding
eth0 back into the bond should not cause a change of active, eth0 would
be added as a backup.

>5) ip link set bond0 type bond active_slave eth0
> dev_addr is the same, swap old active eth1's MAC (MAC0) with eth0.
> Swap new active eth0's permanent MAC (MAC0) to eth1.
> MAC addresses remain unchanged.

So this patch's change wouldn't actually resolve the MAC
conflict until a failover takes place? I.e., if we only do step 4 but
not step 5 or 6, eth0 and eth1 will both have the same MAC address. Am
I understanding correctly?

-J

>6) ip link set bond0 type bond active_slave eth1
> dev_addr is the same, swap the old active eth0's MAC (MAC0) with eth1.
> Swap new active eth1's permanent MAC (MAC1) to eth0.
> The MAC addresses are now correctly differentiated.
>
>Fixes: 3915c1e8634a ("bonding: Add "follow" option to fail_over_mac")
>Signed-off-by: Hangbin Liu <liuhangbin@xxxxxxxxx>
>---
>v2: use memcmp directly instead of adding a redundant helper (Jakub Kicinski)
>---
> drivers/net/bonding/bond_main.c | 9 +++++++--
> 1 file changed, 7 insertions(+), 2 deletions(-)
>
>diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>index e45bba240cbc..1e343d8fafa0 100644
>--- a/drivers/net/bonding/bond_main.c
>+++ b/drivers/net/bonding/bond_main.c
>@@ -1107,8 +1107,13 @@ static void bond_do_fail_over_mac(struct bonding *bond,
> old_active = bond_get_old_active(bond, new_active);
>
> if (old_active) {
>- bond_hw_addr_copy(tmp_mac, new_active->dev->dev_addr,
>- new_active->dev->addr_len);
>+ if (memcmp(old_active->dev->dev_addr, new_active->dev->dev_addr,
>+ new_active->dev->addr_len) == 0)
>+ bond_hw_addr_copy(tmp_mac, new_active->perm_hwaddr,
>+ new_active->dev->addr_len);
>+ else
>+ bond_hw_addr_copy(tmp_mac, new_active->dev->dev_addr,
>+ new_active->dev->addr_len);
> bond_hw_addr_copy(ss.__data,
> old_active->dev->dev_addr,
> old_active->dev->addr_len);
>--
>2.46.0
>

---
-Jay Vosburgh, jv@xxxxxxxxxxxxx