Re: [PATCH net-next 02/13] dpaa2-switch: add support for LAG offload
From: Ioana Ciornei
Date: Fri May 08 2026 - 08:42:19 EST
On Wed, May 06, 2026 at 06:15:29PM +0300, Ioana Ciornei wrote:
> This patch adds the bulk of the changes needed in order to support
> offloading of an upper bond device.
>
> First of all, handling of the NETDEV_CHANGEUPPER and
> NETDEV_PRECHANGEUPPER events is extended so that the driver is capable
> to handle joining or leaving an upper bond device.
> All the restrictions around the LAG offload support are added in the
> newly added dpaa2_switch_pre_lag_join() function.
>
> The same events are extended to also detect if one of our upper bond
> devices changes its own upper device. In this case, on each lower device
> that is DPAA2 the corresponding dpaa2_switch_port_[pre]changeupper()
> function will be called. This will start the process of joining the same
> FDB as the one used by the bridge device.
>
> Setting the 'offload_fwd_mark' field on the skbs is also extended to be
> setup not only when the port is under a bridge but also under a bond
> device that is offloaded.
>
> Signed-off-by: Ioana Ciornei <ioana.ciornei@xxxxxxx>
> ---
> .../ethernet/freescale/dpaa2/dpaa2-switch.c | 390 +++++++++++++++++-
> .../ethernet/freescale/dpaa2/dpaa2-switch.h | 14 +-
> 2 files changed, 402 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c b/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
> index 52c1cb9cb7e0..6367873401c0 100644
> --- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
> +++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
> @@ -51,6 +51,17 @@ dpaa2_switch_filter_block_get_unused(struct ethsw_core *ethsw)
> return NULL;
> }
(...)
Sashiko notes:
When a port joins a bond that is already attached to a bridge,
this calls dpaa2_switch_port_bridge_join() which calls
dpaa2_switch_port_set_fdb() again. Does this second call
inadvertently find the bond's shared FDB on another port and
mark it as unused (in_use = false)?
Also, when a bonded port joins a bridge here, is it correct that
dpaa2_switch_port_bridge_join() calls
switchdev_bridge_port_offload() passing the physical port
(netdev) as both brport_dev and dev? Switchdev relies on the
bridge port state, which might fail if it is expecting the bond
device itself as the logical bridge port.
The logic around the switchdev_bridge_port_offload() is amended in a
following patch named "dpaa2-switch: add
dpaa2_switch_port_to_bridge_port() helper".
The dpaa2_switch_port_set_fdb() function is also updated in a later
commit so that it takes into account a bond upper.
Will consider reordering the patches.
> + if (err)
> + goto err_bridge_join;
> +
> + return err;
> +
> +err_bridge_join:
> +err_lag_cfg:
> + port_priv->lag = NULL;
> + dpaa2_switch_set_lag_cfg(bond_dev, lag_id, ethsw);
> +err_egress_flood:
> + dpaa2_switch_port_set_fdb(port_priv, NULL);
Sashiko notes:
If an error occurs and we jump to err_egress_flood, we revert
the software FDB pointer. Should we also call
dpsw_set_egress_flood() to revert the hardware state so the port
isn't left physically stranded in the bond's flooding domain?
(...)
> + /* Recreate the LAG configuration for the LAG group that we left */
> + port_priv->lag = NULL;
> + dpaa2_switch_set_lag_cfg(bond_dev, lag->id, ethsw);
Sashiko notes:
If the bond was attached to a bridge, do we also need to clean
up the switchdev bridge offload context here?
dpaa2_switch_port_bond_join() calls
dpaa2_switch_port_bridge_join() which allocates context via
switchdev_bridge_port_offload(), but it does not look like there
is a symmetric teardown here.
That is not true. The switchdev_bridge_port_unoffload() function is
called from dpaa2_switch_port_pre_bridge_leave().
> +
> + return 0;
> +}
> +
> static int dpaa2_switch_port_prechangeupper(struct net_device *netdev,
> struct netdev_notifier_changeupper_info *info)
> {
> @@ -2216,6 +2487,9 @@ static int dpaa2_switch_port_prechangeupper(struct net_device *netdev,
>
> if (!info->linking)
> dpaa2_switch_port_pre_bridge_leave(netdev);
> + } else if (netif_is_lag_master(upper_dev) && info->linking) {
> + return dpaa2_switch_pre_lag_join(netdev, upper_dev,
> + info->upper_info, extack);
Sashiko notes:
If a port joins a bond that is already enslaved to a bridge,
does this bypass dpaa2_switch_prechangeupper_sanity_checks() for
the indirect bridge? This might allow joining unsupported
topologies like VLAN-unaware bridges.
Good point. Will fix this case by checking if the upper_devis also a
bridge port and in case it is, to call
dpaa2_switch_prechangeupper_sanity_checks() which verifies those
unsupported topologies.
Also, looking at dpaa2_switch_port_set_fdb() and
dpaa2_switch_prechangeupper_sanity_checks(), they iterate over
the bridge's lowers using netdev_for_each_lower_dev(). Since
this only yields direct lowers, will it skip ports under a LAG?
This could cause standalone ports to miss LAG peer ports and
allocate a separate FDB.
Another good point. The netdev_for_each_lower_dev() will have to be
converted to a netdev_walk_all_lower_dev() so that we check all lowers,
not just those immediately adjacent.
(...)
> +static int
> +dpaa2_switch_lag_changeupper(struct net_device *netdev,
> + struct netdev_notifier_changeupper_info *info)
> +{
> + struct net_device *lower;
> + struct list_head *iter;
> + int err = 0;
> +
> + if (!netif_is_lag_master(netdev))
> + return 0;
> +
> + netdev_for_each_lower_dev(netdev, lower, iter) {
> + if (!dpaa2_switch_port_dev_check(lower))
> + continue;
> +
> + err = dpaa2_switch_port_changeupper(lower, info);
Sashiko notes:
When a LAG device is removed from a bridge, this cascades to
dpaa2_switch_port_bridge_leave() for each port. That function
calls dpaa2_switch_port_set_fdb(port_priv, NULL) which allocates
a standalone FDB. Does this inadvertently break the bond in
hardware by placing slave ports into isolated standalone FDBs?
This is yet another reason to reorder the patches so that
dpaa2_switch_port_set_fdb() is updated before this patch.
> + if (err)
> + return err;
Sashiko notes:
If info->linking is false (a teardown operation) and this
returns immediately on the first error, could this permanently
strand the remaining unprocessed ports in the hardware bridge
configuration? The network stack cannot revert an unlinking
operatione
Even if we would continue with the other ports the state of the entire
topology will be inconsistent so I don't know if it's worth going
forward will the remaining ports.