[Discuss]Why enable individual port in bonding 8023ad?

From: huyizhen2024
Date: Mon Nov 04 2024 - 06:35:34 EST


Why does ad_agg_selection_logic() enable the port of an individual aggregator? I have found no basis for this in the IEEE 802.3ad standard.

I ran into the same problem that chengyechun <chengyechun1@xxxxxxxxxx> and Thomas Bogendoerfer <tbogendoerfer@xxxxxxx> reported:
https://lore.kernel.org/netdev/c464627d07434469b363134ad10e3b4c@xxxxxxxxxx/
https://lore.kernel.org/netdev/20240404114908.134034-1-tbogendoerfer@xxxxxxx/T/

I bond port 1 (enp13s0) and port 2 (enp14s0) into one interface and use nftables to drop the LACPDUs received on port 1.

The bond configuration is as follows:
BONDING_OPTS='mode=4 miimon=100 lacp_rate=fast xmit_hash_policy=layer3+4'
TYPE=Bond
BONDING_MASTER=yes
BOOTPROTO=static
NM_CONTROLLED=no
IPV4_FAILURE_FATAL=no
IPV6INIT=yes
IPV6_AUTOCONF=yes
IPV6_DEFROUTE=yes
IPV6_FAILURE_FATAL=no
IPV6_ADDR_GEN_MODE=stable-privacy
NAME=bond0
DEVICE=bond0
ONBOOT=yes
IPADDR=1.1.1.38
NETMASK=255.255.0.0
IPV6ADDR=1:1:1::39/64

The slave configuration is as follows (enp12s0 shown). I have four similar slaves, such as enp13s0.
NAME=enp12s0
DEVICE=enp12s0
BOOTPROTO=none
ONBOOT=yes
USERCTL=no
NM_CONTROLLED=no
MASTER=bond0
SLAVE=yes
IPV6INIT=yes
IPV6_AUTOCONF=yes
IPV6_DEFROUTE=yes
IPV6_FAILURE_FATAL=no

The nftables configuration is as follows:
# cat /etc/nftables.conf
table netdev filter {
    chain ingress {
        type filter hook ingress device enp13s0 priority 0; policy accept;
        meta protocol 0x8809 drop
    }
}
I then apply this configuration with nft -f /etc/nftables.conf.

During aggregation, the time sequence is as follows:
1. When bond0 receives the NETDEV_PRE_UP event, port 1's aggregator (LAG 1) is chosen as the active LAG. Because port 1 has not received any LACPDU, it is treated as an individual port and is enabled by __enable_port() in ad_agg_selection_logic().
[37.643701] bond0: bond_netdev_event received NETDEV_PRE_UP
[37.643740] bond0: (slave enp13s0): LAG 1 chosen as the active LAG
2. The MUX state machine of port 2 enters the AD_MUX_WAITING state.
[37.643763] bond0: (slave enp14s0): Mux Machine: Port=2, Last State=0, Curr State=1
[37.748705] bond0: (slave enp14s0): Mux Machine: Port=2, Last State=1, Curr State=2
3. Port 2 receives an LACPDU. Since port 2 now has a partner and port 1 does not, ad_agg_selection_logic() elects port 2's aggregator as the best one and disables port 1 with __disable_port(). Unlike port 1, port 2 is not enabled immediately, because it has a partner. Meanwhile, the MUX state machine of port 2 is still in AD_MUX_WAITING (AD_WAIT_WHILE_TIMER takes about 2 s). At this point the bond has no enabled port (no usable slave).
[37.960715] bond0: (slave enp14s0): LAG 2 chosen as the active LAG
4. About two seconds later, the MUX state machine of port 2 enters the AD_MUX_COLLECTING_DISTRIBUTING state and the port is enabled by ad_mux_machine(). The bond finally has an enabled port.
[39.976696] bond0: (slave enp14s0): Mux Machine: Port=2, Last State=2, Curr State=3
[40.084710] bond0: (slave enp14s0): Mux Machine: Port=2, Last State=3, Curr State=4
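
To make the "Last State"/"Curr State" numbers in the log lines above easier to read, here is a small sketch that decodes them into state names. The number-to-name table is an assumption: it is inferred from steps 2 and 4 above (2 = AD_MUX_WAITING, 4 = AD_MUX_COLLECTING_DISTRIBUTING) plus the enum order I remember from the bonding headers, so please check it against mux_states_t in the kernel you actually run. The helper name decode_mux_transitions is made up for this example.

```python
import re

# ASSUMPTION: numbering inferred from the steps above; verify against
# mux_states_t in the bonding headers of the kernel in use.
MUX_STATE_NAMES = {
    0: "AD_MUX_DUMMY",
    1: "AD_MUX_DETACHED",
    2: "AD_MUX_WAITING",
    3: "AD_MUX_ATTACHED",
    4: "AD_MUX_COLLECTING_DISTRIBUTING",
}

# The four Mux Machine lines quoted in the time sequence.
LOG = """\
[37.643763] bond0: (slave enp14s0): Mux Machine: Port=2, Last State=0, Curr State=1
[37.748705] bond0: (slave enp14s0): Mux Machine: Port=2, Last State=1, Curr State=2
[39.976696] bond0: (slave enp14s0): Mux Machine: Port=2, Last State=2, Curr State=3
[40.084710] bond0: (slave enp14s0): Mux Machine: Port=2, Last State=3, Curr State=4
"""

MUX_LINE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\] .*\(slave (?P<slave>[^)]+)\): Mux Machine: "
    r"Port=(?P<port>\d+), Last State=(?P<last>\d+), Curr State=(?P<curr>\d+)"
)


def decode_mux_transitions(log_text):
    """Return (timestamp, slave, last_name, curr_name) for each Mux Machine line."""
    transitions = []
    for line in log_text.splitlines():
        m = MUX_LINE.search(line)
        if m is None:
            continue  # skip lines that are not Mux Machine transitions
        transitions.append((
            m["ts"],
            m["slave"],
            MUX_STATE_NAMES[int(m["last"])],
            MUX_STATE_NAMES[int(m["curr"])],
        ))
    return transitions


if __name__ == "__main__":
    for ts, slave, last, curr in decode_mux_transitions(LOG):
        print(f"[{ts}] {slave}: {last} -> {curr}")
```

With that table, the last two lines read as AD_MUX_WAITING -> AD_MUX_ATTACHED and AD_MUX_ATTACHED -> AD_MUX_COLLECTING_DISTRIBUTING, which matches the description in step 4.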

Between [37.960715] and [40.084710] (about 2.1 s), the bond has no usable port: bond_xmit_3ad_xor_slave_get() cannot find a usable slave, bond_3ad_xor_xmit() drops the packet, and the bond cannot transmit.
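
For reference, the length of that window can be computed directly from the two log lines that bound it. This is only a sketch of the arithmetic; the helper names timestamp and no_port_window are made up for this example, and Decimal is used so the printed value is exact rather than a rounded float.

```python
import re
from decimal import Decimal

TS = re.compile(r"^\[\s*(\d+\.\d+)\]")


def timestamp(line):
    """Extract the leading [seconds.micros] kernel timestamp as a Decimal."""
    m = TS.match(line)
    if m is None:
        raise ValueError(f"no timestamp in line: {line!r}")
    return Decimal(m.group(1))


def no_port_window(disabled_line, enabled_line):
    """Seconds between the line where port 1 is deselected and the line
    where port 2 reaches Curr State=4 and is enabled."""
    return timestamp(enabled_line) - timestamp(disabled_line)


# Log lines copied from steps 3 and 4 of the time sequence.
LAG2_CHOSEN = "[37.960715] bond0: (slave enp14s0): LAG 2 chosen as the active LAG"
PORT2_ENABLED = (
    "[40.084710] bond0: (slave enp14s0): Mux Machine: "
    "Port=2, Last State=3, Curr State=4"
)

if __name__ == "__main__":
    print(no_port_window(LAG2_CHOSEN, PORT2_ENABLED))  # 2.123995
```

So the outage in this run is 2.123995 s, i.e. the roughly 2 s wait-while period plus the roughly 100 ms step from AD_MUX_ATTACHED to AD_MUX_COLLECTING_DISTRIBUTING.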

If instead port 2 is the one that receives no LACPDUs, the bond can send packets on port 1 almost the whole time. The only gap is the roughly 100 ms while the MUX state machine of port 1 moves from AD_MUX_ATTACHED to AD_MUX_COLLECTING_DISTRIBUTING.
The bond should behave the same whether it is port 1 or port 2 that cannot receive LACPDUs. That is, the transmit outage should be equal in both cases, either 2 s or 100 ms. Instead, when port 1 cannot receive LACPDUs the packet loss lasts about 2 s, and when port 2 cannot receive LACPDUs it lasts about 100 ms.
Therefore, I would like to understand why ad_agg_selection_logic() enables the port of an individual aggregator. It does so without checking the state of port 1's MUX state machine. In fact, when port 1 is enabled, its MUX state machine has not been processed yet. According to the IEEE 802.3ad standard, port 1 should be disabled at that point.