[PATCH net] net: bonding: fix bond_xmit_broadcast return value error bug

From: Guangbin Huang
Date: Wed Jan 12 2022 - 07:59:25 EST


From: Jie Wang <wangjie125@xxxxxxxxxx>

In Linux bonding scenario, one packet is copied to several copies and sent
by all slave device of bond0 in mode 3(broadcast mode). The mode 3 xmit
function bond_xmit_broadcast() only ueses the last slave device's tx result
as the final result. In this case, if the last slave device is down, then
it always return NET_XMIT_DROP, even though the other slave devices xmit
success. It may cause the tx statistics error, and cause the application
(e.g. scp) consider the network is unreachable.

For example, use the following command to configure server A.

echo 3 > /sys/class/net/bond0/bonding/mode
ifconfig bond0 up
ifenslave bond0 eth0 eth1
ifconfig bond0 192.168.1.125
ifconfig eth0 up
ifconfig eth1 down
The slave device eth0 and eth1 are connected to server B(192.168.1.107).
Run the ping 192.168.1.107 -c 3 -i 0.2 command, the following information
is displayed.

PING 192.168.1.107 (192.168.1.107) 56(84) bytes of data.
64 bytes from 192.168.1.107: icmp_seq=1 ttl=64 time=0.077 ms
64 bytes from 192.168.1.107: icmp_seq=2 ttl=64 time=0.056 ms
64 bytes from 192.168.1.107: icmp_seq=3 ttl=64 time=0.051 ms

192.168.1.107 ping statistics
0 packets transmitted, 3 received

Actually, the slave device eth0 of the bond successfully sends three
ICMP packets, but the result shows that 0 packets are transmitted.

Also if we use scp command to get remote files, the command end with the
following printings.

ssh_exchange_identification: read: Connection timed out

So this patch modifies the bond_xmit_broadcast to return NET_XMIT_SUCCESS
if one slave device in the bond sends packets successfully. If all slave
devices send packets fail, the discarded packets stats is increased. The
skb is released when there is no slave device in the bond or the last slave
device is down.

Fixes: ae46f184bc1f ("bonding: propagate transmit status")
Signed-off-by: Jie Wang <wangjie125@xxxxxxxxxx>
Signed-off-by: Guangbin Huang <huangguangbin2@xxxxxxxxxx>
---
drivers/net/bonding/bond_main.c | 30 ++++++++++++++++++++++--------
1 file changed, 22 insertions(+), 8 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 07fc603c2fa7..fce80b57f15b 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -4884,25 +4884,39 @@ static netdev_tx_t bond_xmit_broadcast(struct sk_buff *skb,
struct bonding *bond = netdev_priv(bond_dev);
struct slave *slave = NULL;
struct list_head *iter;
+ bool xmit_suc = false;
+ bool skb_used = false;

bond_for_each_slave_rcu(bond, slave, iter) {
- if (bond_is_last_slave(bond, slave))
- break;
- if (bond_slave_is_up(slave) && slave->link == BOND_LINK_UP) {
- struct sk_buff *skb2 = skb_clone(skb, GFP_ATOMIC);
+ struct sk_buff *skb2;
+
+ if (!(bond_slave_is_up(slave) && slave->link == BOND_LINK_UP))
+ continue;

+ if (bond_is_last_slave(bond, slave)) {
+ skb2 = skb;
+ skb_used = true;
+ } else {
+ skb2 = skb_clone(skb, GFP_ATOMIC);
if (!skb2) {
net_err_ratelimited("%s: Error: %s: skb_clone() failed\n",
bond_dev->name, __func__);
continue;
}
- bond_dev_queue_xmit(bond, skb2, slave->dev);
}
+
+ if (bond_dev_queue_xmit(bond, skb2, slave->dev) == NETDEV_TX_OK)
+ xmit_suc = true;
}
- if (slave && bond_slave_is_up(slave) && slave->link == BOND_LINK_UP)
- return bond_dev_queue_xmit(bond, skb, slave->dev);

- return bond_tx_drop(bond_dev, skb);
+ if (!skb_used)
+ dev_kfree_skb_any(skb);
+
+ if (xmit_suc)
+ return NETDEV_TX_OK;
+
+ atomic_long_inc(&bond_dev->tx_dropped);
+ return NET_XMIT_DROP;
}

/*------------------------- Device initialization ---------------------------*/
--
2.33.0