Re: [PATCH iwl-net v3 4/6] ice: check ICE_VSI_DOWN under rtnl_lock when preparing for reset

From: Maciej Fijalkowski
Date: Thu Aug 22 2024 - 10:43:26 EST


On Thu, Aug 22, 2024 at 02:56:50PM +0200, Larysa Zaremba wrote:
> On Thu, Aug 22, 2024 at 01:34:33PM +0200, Maciej Fijalkowski wrote:
> > On Mon, Aug 19, 2024 at 12:05:41PM +0200, Larysa Zaremba wrote:
> > > Consider the following scenario:
> > >
> > > .ndo_bpf()              | ice_prepare_for_reset()               |
> > > ________________________|_______________________________________|
> > > rtnl_lock()             |                                       |
> > > ice_down()              |                                       |
> > >                         | test_bit(ICE_VSI_DOWN) - true         |
> > >                         | ice_dis_vsi() returns                 |
> > > ice_up()                |                                       |
> > >                         | proceeds to rebuild a running VSI     |
> > >
> > > .ndo_bpf() is not the only rtnl-locked callback that toggles the interface
> > > to apply new configuration. Another example is .set_channels().
> > >
> > > To avoid the race condition above, act only after reading ICE_VSI_DOWN
> > > under rtnl_lock.
> > >
> > > Fixes: 0f9d5027a749 ("ice: Refactor VSI allocation, deletion and rebuild flow")
> > > Reviewed-by: Wojciech Drewek <wojciech.drewek@xxxxxxxxx>
> > > Reviewed-by: Jacob Keller <jacob.e.keller@xxxxxxxxx>
> > > Tested-by: Chandan Kumar Rout <chandanx.rout@xxxxxxxxx>
> > > Signed-off-by: Larysa Zaremba <larysa.zaremba@xxxxxxxxx>
> > > ---
> > > drivers/net/ethernet/intel/ice/ice_lib.c | 12 ++++++------
> > > 1 file changed, 6 insertions(+), 6 deletions(-)
> > >
> > > diff --git a/drivers/net/ethernet/intel/ice/ice_lib.c b/drivers/net/ethernet/intel/ice/ice_lib.c
> > > index b72338974a60..94029e446b99 100644
> > > --- a/drivers/net/ethernet/intel/ice/ice_lib.c
> > > +++ b/drivers/net/ethernet/intel/ice/ice_lib.c
> > > @@ -2665,8 +2665,7 @@ int ice_ena_vsi(struct ice_vsi *vsi, bool locked)
> > >   */
> > >  void ice_dis_vsi(struct ice_vsi *vsi, bool locked)
> > >  {
> > > -	if (test_bit(ICE_VSI_DOWN, vsi->state))
> > > -		return;
> > > +	bool already_down = test_bit(ICE_VSI_DOWN, vsi->state);
> > >
> > >  	set_bit(ICE_VSI_NEEDS_RESTART, vsi->state);
> > >
> > > @@ -2674,15 +2673,16 @@ void ice_dis_vsi(struct ice_vsi *vsi, bool locked)
> > >  		if (netif_running(vsi->netdev)) {
> > >  			if (!locked)
> > >  				rtnl_lock();
> > > -
> > > -			ice_vsi_close(vsi);
> > > +			already_down = test_bit(ICE_VSI_DOWN, vsi->state);
> > > +			if (!already_down)
> > > +				ice_vsi_close(vsi);
> >
> > Ehh, sorry for being a sloppy reviewer. We are still testing ICE_VSI_DOWN in
> > ice_vsi_close(). Wouldn't all of this be cleaner if we bailed out of
> > the called function when the bit was already set?
> >
>
> I am not sure I see how to rewrite this as you suggest. We cannot
> bail out in the netif_running() case because we need to unlock after
> ice_vsi_close() finishes. That leaves bailing out for the CTRL VSI and
> non-running PF cases, which we could do, but it would require a lengthy if
> condition, which is not much better than nested code, IMO.

Hmm. I meant moving the bit check into ice_vsi_close() only, so that you
would bail out of it if the bit has already been set.
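
To make the idea concrete, here is a rough userspace model, not driver
code. Everything named model_* is hypothetical: a pthread mutex stands in
for rtnl_lock and a plain bool stands in for ICE_VSI_DOWN. The only point
is that the "already down" check lives in the close helper, under the
lock, so the caller never acts on a value read before locking.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for rtnl_lock; not the kernel API. */
static pthread_mutex_t model_rtnl = PTHREAD_MUTEX_INITIALIZER;

struct model_vsi {
	bool down;        /* models ICE_VSI_DOWN, only touched under model_rtnl */
	int close_calls;  /* counts how often real teardown work ran */
};

/* Models ice_vsi_close() with the check moved inside.
 * The caller must hold model_rtnl.
 */
static void model_vsi_close(struct model_vsi *vsi)
{
	if (vsi->down)
		return;           /* already down: nothing to do */

	vsi->down = true;
	vsi->close_calls++;   /* placeholder for the actual teardown */
}

/* Models ice_dis_vsi(): no state check before the lock is taken. */
static void model_dis_vsi(struct model_vsi *vsi, bool locked)
{
	if (!locked)
		pthread_mutex_lock(&model_rtnl);

	model_vsi_close(vsi);

	if (!locked)
		pthread_mutex_unlock(&model_rtnl);
}
```

Calling model_dis_vsi() twice on the same model_vsi does the teardown
once; the second call takes the lock, sees the flag and returns. Whether
the real ice_vsi_close() can be reshaped this way without changing
behavior is exactly the open question.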

Overall, ice_dis_vsi() is a very cumbersome way of calling ice_vsi_close()
:(

I guess we can proceed with what you have, but I'd like to brainstorm
later about some simplification around it.

I prototyped something but have not tested it, just to maybe spark a
discussion. It feels easier to read and digest in the end. Not sure if
the functionality is preserved :)