Re: [PATCH net] ice: Fix NULL pointer access, if PF doesn't support SRIOV_LAG
From: Thomas Bogendoerfer
Date: Tue Sep 03 2024 - 17:44:13 EST
On Fri, 30 Aug 2024 17:12:56 +0000
"Ertman, David M" <david.m.ertman@xxxxxxxxx> wrote:
> > -----Original Message-----
> > From: Thomas Bogendoerfer <tbogendoerfer@xxxxxxx>
> > Sent: Tuesday, August 27, 2024 12:12 PM
> > To: Kitszel, Przemyslaw <przemyslaw.kitszel@xxxxxxxxx>
> > Cc: Nguyen, Anthony L <anthony.l.nguyen@xxxxxxxxx>; David S. Miller
> > <davem@xxxxxxxxxxxxx>; Eric Dumazet <edumazet@xxxxxxxxxx>; Jakub
> > Kicinski <kuba@xxxxxxxxxx>; Paolo Abeni <pabeni@xxxxxxxxxx>; intel-
> > wired-lan@xxxxxxxxxxxxxxxx; netdev@xxxxxxxxxxxxxxx; linux-
> > kernel@xxxxxxxxxxxxxxx; Ertman, David M <david.m.ertman@xxxxxxxxx>; Jiri
> > Pirko <jiri@xxxxxxxxxxx>
> > Subject: Re: [PATCH net] ice: Fix NULL pointer access, if PF doesn't support
> > SRIOV_LAG
> >
> > On Tue, 27 Aug 2024 09:16:51 +0200
> > Przemek Kitszel <przemyslaw.kitszel@xxxxxxxxx> wrote:
> >
> > > On 8/26/24 12:17, Thomas Bogendoerfer wrote:
> > > > On Mon, 26 Aug 2024 11:41:19 +0200
> > > > Jiri Pirko <jiri@xxxxxxxxxxx> wrote:
> > > >
> > > >> Mon, Aug 26, 2024 at 10:58:30AM CEST, tbogendoerfer@xxxxxxx wrote:
> > > >>> For PFs, which don't support SRIOV_LAG, there is no pf->lag struct
> > > >>> allocated. So before accessing pf->lag a NULL pointer check is needed.
> > > >>>
> > > >>> Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@xxxxxxx>
> > > >>
> > > >> You need to add a "fixes" tag blaming the commit that introduced the
> > > >> bug.
> > >
> > > Would be also good to CC the author.
> >
> > sure, I'm using get_maintainer for building address line and looks
> > like it only adds the author, if there is a Fixes tag, which IMHO
> > makes more sense than mailing all possible authors of file (in this
> > case it would work, but there are other files).
> >
> > > > Fixes: 1e0f9881ef79 ("ice: Flesh out implementation of support for
> > > > SRIOV on bonded interface")
> > >
> > > the bug was introduced later, the tag should be:
> > > Fixes: ec5a6c5f79ed ("ice: process events created by lag netdev event
> > > handler")
> >
> > I'd like to disagree, ec5a6c5f79ed adds an empty
> > ice_lag_move_new_vf_nodes(),
> > which will do no harm if pf->lag is NULL. Commit 1e0f9881ef79 introduces
> > the access to pf->lag without checking for NULL.
> > >
> > > The mentioned commit extracted code into
> > ice_lag_move_new_vf_nodes(),
> > > and there is just one call to this function by now, just after
> > > releasing lag_mutex, so would be good to change the semantics of
> > > ice_lag_move_new_vf_nodes() to "only for lag-enabled flows, with
> > > lag_mutex held", and fix the call to it to reflect that.
> >
> > I could do that for sure, but IMHO this is about fixing a bug,
> > which crashes the kernel. Making the code better should be done
> > after fixing.
>
> Thomas,
>
> Nice catch!
>
> I looked into this a bit and it seems that when I sent in patch:
> commit 9f74a3dfcf83 ("ice: Fix VF Reset paths when interface in a failed over aggregate)
>
> I left in a spurious call to the previous function for moving nodes. Since it is
> just in the error path it went unnoticed this long.
>
> Since this is the only call to ice_lag_move_new_vf_nodes(), it seems that
> proper way of fixing this would be to eliminate the spurious call and the function
> definition entirely.
>
> If you do no want to do this, I can volunteer to write the patch.
either way is fine. But shouldn't the fix alone just applied first ?
Who will pick it up ?
Thomas.
--
SUSE Software Solutions Germany GmbH
HRB 36809 (AG Nürnberg)
Geschäftsführer: Ivo Totev, Andrew McDonald, Werner Knoblich