Re: [PATCH v3 net-next 00/12] Better support for sandwiched LAGs with bridge and DSA

From: Nikolay Aleksandrov
Date: Mon Mar 22 2021 - 12:28:24 EST


On 21/03/2021 00:34, Vladimir Oltean wrote:
> From: Vladimir Oltean <vladimir.oltean@xxxxxxx>
>
> The objective of this series is to make LAG uppers on top of switchdev
> ports work regardless of which order we link interfaces to their masters
> (first make the port join the LAG, then the LAG join the bridge, or the
> other way around).
>
> There was a design decision to be made in patches 2-4 on whether we
> should adopt the "push" model (which attempts to solve the problem
> centrally, in the bridge layer) where the driver just calls:
>
> switchdev_bridge_port_offloaded(brport_dev,
> &atomic_notifier_block,
> &blocking_notifier_block,
> extack);
>
> and the bridge just replays the entire collection of switchdev port
> attributes and objects that it has, in some predefined order and with
> some predefined error handling logic;
>
>
> or the "pull" model (which attempts to solve the problem by giving the
> driver the rope to hang itself), where the driver, apart from calling:
>
> switchdev_bridge_port_offloaded(brport_dev, extack);
>
> has the task of "dumpster diving" (as Tobias puts it) through the bridge
> attributes and objects by itself, by calling:
>
> - br_vlan_replay
> - br_fdb_replay
> - br_mdb_replay
> - br_vlan_enabled
> - br_port_flag_is_set
> - br_port_get_stp_state
> - br_multicast_router
> - br_get_ageing_time
>
> (not necessarily all of them, and not necessarily in this order, and
> with driver-defined error handling).
>
> Even though I'm not in love myself with the "pull" model, I chose it
> because there is a fundamental trick with replaying switchdev events
> like this:
>
> ip link add br0 type bridge
> ip link add bond0 type bond
> ip link set bond0 master br0
> ip link set swp0 master bond0 <- this will replay the objects once for
> the bond0 bridge port, and the swp0
> switchdev port will process them
> ip link set swp1 master bond0 <- this will replay the objects again for
> the bond0 bridge port, and the swp1
> switchdev port will see them, but swp0
> will see them for the second time now
>
> Basically I believe that it is implementation defined whether the driver
> wants to error out on switchdev objects seen twice on a port, and the
> bridge should not enforce a certain model for that. For example, for FDB
> entries added to a bonding interface, the underling switchdev driver
> might have an abstraction for just that: an FDB entry pointing towards a
> logical (as opposed to physical) port. So when the second port joins the
> bridge, it doesn't realy need to replay FDB entries, since there is
> already at least one hardware port which has been receiving those
> events, and the FDB entries don't need to be added a second time to the
> same logical port.
> In the other corner, we have the drivers that handle switchdev port
> attributes on a LAG as individual switchdev port attributes on physical
> ports (example: VLAN filtering). In fact, the switchdev_handle_port_attr_set
> helper facilitates this: it is a fan-out from a single orig_dev towards
> multiple lowers that pass the check_cb().
> But that's the point: switchdev_handle_port_attr_set is just a helper
> which the driver _opts_ to use. The bridge can't enforce the "push"
> model, because that would assume that all drivers handle port attributes
> in the same way, which is probably false.
>
> For this reason, I preferred to go with the "pull" mode for this patch
> set. Just to see how bad it is for other switchdev drivers to copy-paste
> this logic, I added the pull support to ocelot too, and I think it's
> pretty manageable.
>
> Vladimir Oltean (12):
> net: dsa: call dsa_port_bridge_join when joining a LAG that is already
> in a bridge
> net: dsa: pass extack to dsa_port_{bridge,lag}_join
> net: dsa: inherit the actual bridge port flags at join time
> net: dsa: sync up with bridge port's STP state when joining
> net: dsa: sync up VLAN filtering state when joining the bridge
> net: dsa: sync multicast router state when joining the bridge
> net: dsa: sync ageing time when joining the bridge
> net: dsa: replay port and host-joined mdb entries when joining the
> bridge
> net: dsa: replay port and local fdb entries when joining the bridge
> net: dsa: replay VLANs installed on port when joining the bridge
> net: ocelot: call ocelot_netdevice_bridge_join when joining a bridged
> LAG
> net: ocelot: replay switchdev events when joining bridge
>
> drivers/net/dsa/ocelot/felix.c | 4 +-
> drivers/net/ethernet/mscc/ocelot.c | 18 +--
> drivers/net/ethernet/mscc/ocelot_net.c | 208 +++++++++++++++++++++----
> include/linux/if_bridge.h | 40 +++++
> include/net/switchdev.h | 1 +
> include/soc/mscc/ocelot.h | 6 +-
> net/bridge/br_fdb.c | 52 +++++++
> net/bridge/br_mdb.c | 84 ++++++++++
> net/bridge/br_stp.c | 27 ++++
> net/bridge/br_vlan.c | 71 +++++++++
> net/dsa/dsa_priv.h | 9 +-
> net/dsa/port.c | 203 ++++++++++++++++++------
> net/dsa/slave.c | 11 +-
> 13 files changed, 631 insertions(+), 103 deletions(-)
>

Hi Vladimir,
Please pull all of the new bridge code into separate patches with the proper
bridge subsystems tagged in the subject.
I'll review the bridge changes in a minute.

Thanks,
Nik