Re: [PATCHv3 net 1/3] bonding: set AD_RX_PORT_DISABLED when disabling a port
From: Hangbin Liu
Date: Thu Feb 26 2026 - 21:31:18 EST
On Thu, Feb 26, 2026 at 05:16:55PM -0800, Jay Vosburgh wrote:
> Hangbin Liu <liuhangbin@xxxxxxxxx> wrote:
>
> >When disabling a port’s collecting and distributing states, updating only
> >rx_disabled is not sufficient. We also need to set AD_RX_PORT_DISABLED
> >so that the rx_machine transitions into the AD_RX_EXPIRED state.
> >
> >One example is in ad_agg_selection_logic(): when a new aggregator is
> >selected and old active aggregator is disabled, if AD_RX_PORT_DISABLED is
> >not set, the disabled port may remain stuck in AD_RX_CURRENT due to
> >continuing to receive partner LACP messages.
>
> I'm not sure I'm seeing the problem here, is there an actual
> misbehavior being fixed here? The port is receiving LACPDUs, and from
> the receive state machine point of view (Figure 6-18) there's no issue.
> The "port_enabled" variable (6.4.7) also informs the state machine
> behavior, but that's not the same as what's changed by bonding's
> __disable_port function.
Yes. The reason I do it here is that we select another aggregator and call
__disable_port() for the old one. If we don't update sm_rx_state, the port
will stay in the collecting/distributing state, and the partner will also
stay in the c/d state.
This is a contradiction: on one hand we want to disable the port, on the
other hand we keep the port in the collecting/distributing state.
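To illustrate the point, here is a minimal toy model of the situation. It is
only a sketch and not the bonding code: all toy_* names are made up for this
mail, and the real LACP timers are reduced to "an LACPDU arrived on this tick
or it did not". It only shows that a port which keeps receiving partner
LACPDUs never leaves the current state unless the disable path also resets
the receive state.

```c
#include <stdbool.h>

/* Toy receive states, loosely named after the ones discussed in this thread. */
enum toy_rx_state { TOY_RX_PORT_DISABLED, TOY_RX_EXPIRED, TOY_RX_CURRENT };

struct toy_port {
	enum toy_rx_state rx_state;
	bool rx_disabled;	/* data-path flag only, not seen by the rx machine */
	bool link_up;
};

/* One tick of the toy receive machine (assumption: no real timers). */
static void toy_rx_step(struct toy_port *p, bool lacpdu_received)
{
	switch (p->rx_state) {
	case TOY_RX_PORT_DISABLED:
		if (p->link_up)
			p->rx_state = TOY_RX_EXPIRED;
		break;
	case TOY_RX_EXPIRED:
		if (lacpdu_received)
			p->rx_state = TOY_RX_CURRENT;
		break;
	case TOY_RX_CURRENT:
		/* Partner LACPDUs keep the port current. */
		if (!lacpdu_received)
			p->rx_state = TOY_RX_EXPIRED;
		break;
	}
}

/* Hypothetical disable helper: optionally reset the rx state as well. */
static void toy_disable_port(struct toy_port *p, bool reset_rx_state)
{
	p->rx_disabled = true;
	if (reset_rx_state)
		p->rx_state = TOY_RX_PORT_DISABLED;
}

/* Disable a current port, then run 'ticks' steps with LACPDUs still arriving. */
static enum toy_rx_state toy_state_after_disable(bool reset_rx_state, int ticks)
{
	struct toy_port p = { TOY_RX_CURRENT, false, true };

	toy_disable_port(&p, reset_rx_state);
	for (int i = 0; i < ticks; i++)
		toy_rx_step(&p, true);
	return p.rx_state;
}
```

In this model, toy_state_after_disable(false, n) stays in TOY_RX_CURRENT for
any n, while toy_state_after_disable(true, 1) passes through TOY_RX_EXPIRED
before the next LACPDU brings the port back to current.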
>
> Where I'm going with this is that, when multiple aggregator
> support was originally implemented, the theory was to keep aggregators
> other than the active agg in a state such that they could be put into
> service immediately, without having to do LACPDU exchanges in order to
> transition into the appropriate state. A hot standby, basically,
> analogous to an active-backup mode backup interface with link state up.
This sounds good. But without an LACPDU exchange, the hot standby actor and
partner have to stay in the collecting/distributing state. What should we do
when the partner starts sending packets to us?
>
> I haven't tested this in some time, though, so my question is
> whether this change affects the failover time when an active aggregator
> is de-selected in favor of another aggregator. By "failover time," I
> mean how long transmission and/or reception are interrupted when
> changing from one aggregator to another. I presume that if aggregator
> failover after this change requires LACPDU exchanges, etc, it will take
> longer to fail over.
I haven't tested it yet. I expect the failover time to be within 1 second.
Let me do some testing today.
Thanks
Hangbin