Re: [PATCH net] lan966x: Fix crash when adding interface under a lag

From: Horatiu Vultur
Date: Mon Feb 05 2024 - 04:44:51 EST


The 02/05/2024 09:44, Michal Swiatkowski wrote:

Hi Michal,

>
> On Mon, Feb 05, 2024 at 09:07:56AM +0100, Horatiu Vultur wrote:
> > There is a crash when adding one of the lan966x interfaces under a lag
> > interface. The issue can be reproduced like this:
> > ip link add name bond0 type bond miimon 100 mode balance-xor
> > ip link set dev eth0 master bond0
> >
> > The reason is that when adding an interface under the lag, the driver goes
> > through all the ports and tries to figure out which other ports are under
> > that lag interface. The issue is that lan966x can have ports that are
> > NULL pointers, as they are not probed. Iterating over these ports then
> > crashes, because the NULL entries are dereferenced.
> > The fix is to check for NULL pointers before accessing anything
> > from the ports, as is already done in other places.
> >
> > Fixes: cabc9d49333d ("net: lan966x: Add lag support for lan966x")
> > Signed-off-by: Horatiu Vultur <horatiu.vultur@xxxxxxxxxxxxx>
> > ---
> > drivers/net/ethernet/microchip/lan966x/lan966x_lag.c | 9 +++++++--
> > 1 file changed, 7 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_lag.c b/drivers/net/ethernet/microchip/lan966x/lan966x_lag.c
> > index 41fa2523d91d3..89a2c3176f1da 100644
> > --- a/drivers/net/ethernet/microchip/lan966x/lan966x_lag.c
> > +++ b/drivers/net/ethernet/microchip/lan966x/lan966x_lag.c
> > @@ -37,19 +37,24 @@ static void lan966x_lag_set_aggr_pgids(struct lan966x *lan966x)
> >
> >  	/* Now, set PGIDs for each active LAG */
> >  	for (lag = 0; lag < lan966x->num_phys_ports; ++lag) {
> > -		struct net_device *bond = lan966x->ports[lag]->bond;
> > +		struct lan966x_port *port = lan966x->ports[lag];
> >  		int num_active_ports = 0;
> > +		struct net_device *bond;
> >  		unsigned long bond_mask;
> >  		u8 aggr_idx[16];
> >
> > -		if (!bond || (visited & BIT(lag)))
> > +		if (!port || !port->bond || (visited & BIT(lag)))
> >  			continue;
> >
> > +		bond = lan966x->ports[lag]->bond;
> Why not bond = port->bond?

That is also correct and clearer.
I think I just copied the line that I had removed and put it here, as it
has the same effect.
I can update this in the next version.
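
For illustration only, here is a minimal user-space sketch of the pattern
being discussed: check the port pointer first, then read the bond through
the already-checked local. This is not the driver code. The *_stub types,
NUM_PHYS_PORTS and both helper functions are made up for the example and
only mimic the shape of the loop in lan966x_lag_set_aggr_pgids().

```c
#include <stddef.h>
#include <stdint.h>

#define NUM_PHYS_PORTS 8	/* hypothetical port count for this sketch */

struct net_device_stub {	/* stand-in for struct net_device */
	int id;
};

struct port_stub {		/* stand-in for struct lan966x_port */
	struct net_device_stub *bond;
};

struct switch_stub {		/* stand-in for struct lan966x */
	/* Entries stay NULL for ports that were never probed. */
	struct port_stub *ports[NUM_PHYS_PORTS];
};

/* Return a bitmask of the ports enslaved to 'bond', skipping NULL slots. */
static uint32_t lag_get_mask(const struct switch_stub *sw,
			     const struct net_device_stub *bond)
{
	uint32_t mask = 0;
	int p;

	for (p = 0; p < NUM_PHYS_PORTS; ++p) {
		const struct port_stub *port = sw->ports[p];

		if (!port || !port->bond)
			continue;

		if (port->bond == bond)
			mask |= (uint32_t)1 << p;
	}

	return mask;
}

/* Count distinct bonds, visiting each bond only once. */
static int count_active_lags(const struct switch_stub *sw)
{
	uint32_t visited = 0;
	int lags = 0;
	int lag;

	for (lag = 0; lag < NUM_PHYS_PORTS; ++lag) {
		const struct port_stub *port = sw->ports[lag];
		const struct net_device_stub *bond;

		/* The NULL check must come before any access through 'port'. */
		if (!port || !port->bond || (visited & ((uint32_t)1 << lag)))
			continue;

		/* 'port' is known to be valid here, so use it directly. */
		bond = port->bond;
		visited |= lag_get_mask(sw, bond);
		++lags;
	}

	return lags;
}
```

With two ports under the same bond, one unprobed (NULL) slot between them
and one probed port without a bond, the sketch walks the array without
dereferencing the NULL entry and reports a single active LAG.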

>
> >  		bond_mask = lan966x_lag_get_mask(lan966x, bond);
> >
> >  		for_each_set_bit(p, &bond_mask, lan966x->num_phys_ports) {
> >  			struct lan966x_port *port = lan966x->ports[p];
> >
> > +			if (!port)
> > +				continue;
> > +
> >  			lan_wr(ANA_PGID_PGID_SET(bond_mask),
> >  			       lan966x, ANA_PGID(p));
> >  			if (port->lag_tx_active)
> > --
> > 2.34.1
> >
> Only nit, otherwise:
> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@xxxxxxxxxxxxxxx>
>
> Thanks,
> Michal

--
/Horatiu