Re: [PATCH iwl-next v5 3/5] ice: migrate to netdev ops lock
From: Simon Horman
Date: Sun May 10 2026 - 04:58:03 EST
On Fri, May 08, 2026 at 03:57:25PM +0200, Alexander Lobakin wrote:
> From: Simon Horman <horms@xxxxxxxxxx>
> Date: Fri, 8 May 2026 14:28:29 +0100
>
> > From: 'Simon Horman' <horms@xxxxxxxxxx>
> >
> > This is an AI-generated review of your patch. The human sending this
> > email has considered the AI review valid, or at least plausible.
> > Full review at: https://sashiko.dev
> > ---
> > ice: migrate to netdev ops lock
> >
> > This commit migrates queue management operations to use netdev locking.
> > It updates several NAPI and VSI configuration functions to use their
> > _locked variants to avoid deadlocks, taking the lock manually where needed.
> >
> >> diff --git a/drivers/net/ethernet/intel/ice/ice_dcb_lib.c b/drivers/net/ethernet/intel/ice/ice_dcb_lib.c
> >> index 16aa255351523..7d89c0acc5d8d 100644
> >> --- a/drivers/net/ethernet/intel/ice/ice_dcb_lib.c
> >> +++ b/drivers/net/ethernet/intel/ice/ice_dcb_lib.c
> > [ ... ]
> >> @@ -1107,7 +1106,7 @@ ice_dcb_process_lldp_set_mib_change(struct ice_pf *pf,
> >>
> >> rtnl_lock();
> >> /* disable VSIs affected by DCB changes */
> >> - ice_dcb_ena_dis_vsi(pf, false, true);
> >> + ice_dcb_ena_dis_vsi(pf, false);
> >>
> >> ret = ice_query_port_ets(pi, &buf, sizeof(buf), NULL);
> >> if (ret) {
> >> @@ -1119,7 +1118,7 @@ ice_dcb_process_lldp_set_mib_change(struct ice_pf *pf,
> >> ice_pf_dcb_recfg(pf, false);
> >>
> >> /* enable previously downed VSIs */
> >> - ice_dcb_ena_dis_vsi(pf, true, true);
> >> + ice_dcb_ena_dis_vsi(pf, true);
> >> unlock_rtnl:
> >> rtnl_unlock();
> >> out:
> >
> > This wasn't introduced by this patch, but does this error path leave VSIs
> > permanently disabled?
> >
> > If ice_query_port_ets() fails, the code jumps to unlock_rtnl, bypassing the
> > call to ice_dcb_ena_dis_vsi(pf, true). This seems to leave the affected VSIs
> > (including the main PF interface) in a disabled state (ICE_VSI_DOWN) with
> > ICE_VSI_NEEDS_RESTART set, which could result in a denial of service.
> >
> >> diff --git a/drivers/net/ethernet/intel/ice/ice_eswitch.c b/drivers/net/ethernet/intel/ice/ice_eswitch.c
> >> index 2e4f0969035f7..af0cc77fbf718 100644
> >> --- a/drivers/net/ethernet/intel/ice/ice_eswitch.c
> >> +++ b/drivers/net/ethernet/intel/ice/ice_eswitch.c
> >> @@ -23,10 +23,16 @@ static int ice_eswitch_setup_env(struct ice_pf *pf)
> >> struct net_device *netdev = uplink_vsi->netdev;
> >> bool if_running = netif_running(netdev);
> >> struct ice_vsi_vlan_ops *vlan_ops;
> >> + int ret;
> >> +
> >> + if (if_running && !test_and_set_bit(ICE_VSI_DOWN, uplink_vsi->state)) {
> >> + netdev_lock(netdev);
> >> + ret = ice_down(uplink_vsi);
> >> + netdev_unlock(netdev);
> >>
> >> - if (if_running && !test_and_set_bit(ICE_VSI_DOWN, uplink_vsi->state))
> >> - if (ice_down(uplink_vsi))
> >> + if (ret)
> >> return -ENODEV;
> >> + }
> >
> > Simon says: It seems that the topic of locks in this patch has come up
> > again ¯\_(ツ)_/¯
> >
> > Does manipulating the state bit outside of netdev_lock create a race
> > condition with interface teardown?
> >
> > If the interface is concurrently brought down, ndo_stop (which holds
> > netdev_lock) would call ice_vsi_close(). Because ICE_VSI_DOWN is already
> > set by the test_and_set_bit() call above, ice_vsi_close() would skip calling
> > ice_down() and immediately proceed to free the Tx/Rx rings.
> >
> > This would leave the hardware queues active and performing DMA operations into
> > freed memory, potentially causing a use-after-free.
>
> I ran Sashiko internally a couple times and it hasn't found these =\
>
> Anyway, now that I anyway need to adjust the series after
> .ndo_set_rx_mode_asyn() landed, I'll take a look at this, too.
>
> (I hate this series already. I feel like if Sashiko was alive when other
> vendors were switching to the netdev lock, they'd have the same drama)
Life is certainly different with Sashiko.
No question about it.
>
> (I was planning to add devmem/io_uring support to idpf after this lands,
> but given that idpf is way more confusing and inconsistent in locking,
> I'm not sure I'll have enough patience :D)
>
> Thanks,
> Olek