Re: [PATCH next] bonding: Extending LACP MUX State Machine to include a Collecting State.

From: David Dillow
Date: Fri Dec 22 2023 - 01:28:20 EST


> I haven't read the patch in detail yet, but my overall question
> is: why do we need this? This adds significant complexity to the
> state machine logic. What real problem is this solving, i.e., what
> examples do you have of systems where a port is "in a state where
> it can receive incoming packets while not still distributing"?

Any time we add a new link to an aggregator, or the bond selects a new
aggregator based on the selection policy, there is currently a race
where we start distributing traffic before our partner (usually a
switch) is ready to start collecting it, leading to dropped packets if
we're running traffic over the bond. We reliably hit this window,
making what should be a non-issue into a customer-visible packet-loss
event. Implementing the full state machine closes the window and makes
these maintenance events lossless.
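
To make the window concrete, here is a minimal sketch of the transitions
in question, assuming the independent-control mux machine as I read it in
IEEE 802.1AX. All names here (mux_state, mux_inputs, mux_next) are made
up for illustration; this is not the bonding driver's code or the code in
the patch. The point is only that a port sits in a collecting-only state
until the partner reports that it is collecting, and only then starts
distributing.

```c
#include <stdbool.h>

/* Hypothetical names for illustration; not the bonding driver's identifiers. */
enum mux_state {
	MUX_ATTACHED,     /* attached to the aggregator, no traffic yet */
	MUX_COLLECTING,   /* accept received frames, do not transmit */
	MUX_DISTRIBUTING, /* receive and transmit */
};

struct mux_inputs {
	bool selected;           /* port is selected for this aggregator */
	bool partner_sync;       /* partner reports Synchronization */
	bool partner_collecting; /* partner reports Collecting */
};

/* Compute the next mux state from the current state and the inputs. */
static enum mux_state mux_next(enum mux_state cur, const struct mux_inputs *in)
{
	bool in_sync = in->selected && in->partner_sync;

	switch (cur) {
	case MUX_ATTACHED:
		/* Start receiving as soon as both sides are in sync. */
		return in_sync ? MUX_COLLECTING : MUX_ATTACHED;
	case MUX_COLLECTING:
		if (!in_sync)
			return MUX_ATTACHED;
		/* Only transmit once the partner says it is collecting. */
		return in->partner_collecting ? MUX_DISTRIBUTING : MUX_COLLECTING;
	case MUX_DISTRIBUTING:
		/* Fall back to receive-only if the partner stops collecting. */
		if (in_sync && in->partner_collecting)
			return MUX_DISTRIBUTING;
		return MUX_COLLECTING;
	}
	return cur;
}
```

With a single combined collecting/distributing state, the ATTACHED step
would go straight to transmitting, which is the race described above.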