Re: [PATCH net-next v3 02/12] dpaa2-switch: add support for LAG offload
From: Ioana Ciornei
Date: Mon Jun 08 2026 - 05:21:38 EST
On Fri, Jun 05, 2026 at 07:10:47PM -0700, Jakub Kicinski wrote:
> This is an AI-generated review of your patch. The human sending this
> email has considered the AI review valid, or at least plausible.
> ---
> dpaa2-switch: add support for LAG offload
>
> This patch adds the bulk of the changes needed in order to support
> offloading of an upper bond device.
>
> > diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c b/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
> > index a0bf5b50aae5f..d082ecf9b1252 100644
> > --- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
> > +++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c
>
> [ ... ]
>
> > @@ -2210,19 +2300,275 @@ dpaa2_switch_prechangeupper_sanity_checks(struct net_device *netdev,
> > +static int dpaa2_switch_port_bond_join(struct net_device *netdev,
> > + struct net_device *bond_dev,
> > + struct netdev_lag_upper_info *info,
> > + struct netlink_ext_ack *extack)
> > +{
>
> [ ... ]
>
> > + /* If the bond device is a switch port, join the bridge as well */
> > + bridge_dev = netdev_master_upper_dev_get(bond_dev);
> > + if (!bridge_dev || !netif_is_bridge_master(bridge_dev))
> > + return 0;
> > +
> > + err = dpaa2_switch_port_bridge_join(netdev, bridge_dev, extack);
>
> [Severity: High]
> Should switchdev_bridge_port_offload() inside dpaa2_switch_port_bridge_join()
> receive the actual bridge port (the bond device) rather than the lower
> physical port?
>
> For ports enslaved to a LAG, passing the lower physical port might cause
> switchdev to immediately abort the offload with -EINVAL, since the netdev
> is not directly enslaved to the bridge. I see this was addressed in a later
> commit adding a dpaa2_switch_port_to_bridge_port() helper.
>
> [Severity: High]
> When a port joins a LAG that is already enslaved to a bridge, is there a risk
> of dpaa2_switch_port_set_fdb() corrupting the in_use flag of active FDBs?
>
> Since the port first acquires the bond's shared FDB, and then
> dpaa2_switch_port_bridge_join() finds a port already using that same FDB,
> the code in dpaa2_switch_port_set_fdb() seems to lack a self-assignment
> check. This could mark the shared FDB as unused before reassigning it,
> allowing another standalone port to reallocate it. I note this was fixed in
> a later commit covering bond scenarios.
>
> [Severity: High]
> Will dpaa2_switch_port_set_fdb() skip bonds when iterating through lower
> devices, breaking cross-LAG forwarding?
>
> When a port joins a bridge, dpaa2_switch_port_set_fdb() uses a non-recursive
> iterator (netdev_for_each_lower_dev). If the other lower device is a bond,
> dpaa2_switch_port_dev_check() will return false, skipping nested ports and
> failing to share the hardware FDB. I note this was also fixed in a later
> commit.
As sashisko also notes, these concerns were all addressed in later
commits. After v1 I tried to reorder the patches in the hope that I
reduce the chance of this kind of feedback but the later commits use
structures / data put in place by this patch. This basically means that
reversing the order would not improve anything.
The only alternative is just to put everything in a single patch (FDB
management change, changeupper events for bond etc) but I don't think
that would make it easier to read.
Please let me know how I should proceed.
>
> > + if (err)
> > + goto err_bridge_join;
> > +
> > + return err;
> > +
> > +err_bridge_join:
> > + dpaa2_switch_port_del_vlan(port_priv, DEFAULT_VLAN_ID);
> > +err_vlan_add:
> > +err_lag_cfg:
> > + rcu_assign_pointer(port_priv->lag, NULL);
> > + dpaa2_switch_set_lag_cfg(bond_dev, lag_id, ethsw);
> > +err_egress_flood:
> > + if (port_priv->fdb != old_fdb) {
> > + new_fdb = port_priv->fdb;
> > +
> > + /* Explicitly move the port back into the old private FDB */
> > + old_fdb->in_use = true;
> > + old_fdb->bridge_dev = NULL;
> > + port_priv->fdb = old_fdb;
> > +
> > + /* Reprogram the bond FDB to no longer include this port */
> > + dpaa2_switch_fdb_set_egress_flood(ethsw, new_fdb->fdb_id);
>
> [Severity: High]
> Are we permanently leaking a hardware FDB resource in this error path?
>
> If dpaa2_switch_port_bridge_join() fails, it will have allocated a new
> intermediate FDB and marked it in_use. The cleanup block here restores
> the original old_fdb, but does not appear to set new_fdb->in_use back to
> false. Repeated failures might exhaust the fixed number of FDBs.
No, this concern is not valid. The only case in which the FDB chances
when a port join a bond is when this is not the first dpaa2-switch port
to join that bond. In that case, the new FDB was already marked in_use
prior to this moment so that change does not need to be reverted since
even if this bond join fails, that FDB will still be in use for the
other ports under the bond.
>
> > + } else {
> > + /* Same FDB means that this was the first port joining the
> > + * bond. Keep using the same FDB.
> > + */
> > + old_fdb->bridge_dev = NULL;
> > + }
>
> [ ... ]
>
> > +static int dpaa2_switch_port_bond_leave(struct net_device *netdev,
> > + struct net_device *bond_dev)
> > +{
>
> [ ... ]
>
> > +lag_cleanup:
> > + /* Recreate the LAG configuration for the LAG group that we left. In
> > + * case any step failed, at least we free up a LAG resource.
> > + */
> > + rcu_assign_pointer(port_priv->lag, NULL);
> > + dpaa2_switch_set_lag_cfg(bond_dev, lag->id, ethsw);
>
> [Severity: Critical]
> Could this trigger a kernel panic if lag is NULL on the rollback path?
>
> If dpaa2_switch_port_bond_join() fails and sets port_priv->lag to NULL, a
> subsequent NETDEV_CHANGEUPPER rollback calls this function. The lag variable
> is initialized via rtnl_dereference(port_priv->lag), which would be NULL,
> leading to a dereference when accessing lag->id here.
>From what I know, a failure in a bond join will not generate a
subsequent event with linking=false and the leave path only gets called
with a valid upper device.
Anyhow, I can add a defensive check for a NULL port_priv->lag pointer at
the start of this function.
>
> > @@ -2263,6 +2627,80 @@ static int dpaa2_switch_port_changeupper(struct net_device *netdev,
> > extack);
> > else
> > return dpaa2_switch_port_bridge_leave(netdev);
>
> [Severity: High]
> Does calling dpaa2_switch_port_bridge_leave() wrongly split LAG member ports
> into isolated private FDBs?
>
> When a LAG leaves a bridge, this is invoked for each member port. It will
> unconditionally call dpaa2_switch_port_set_fdb(port_priv, NULL), which forces
> the port to drop the shared FDB and isolates it from the other LAG members.
> I note this was also fixed in a later commit covering bond scenarios.
This is the same as above, concern addressed by a later commit.
Ioana