Re: [PATCH iwl-net v2] idpf: do not perform flow ops when netdev is detached

From: Simon Horman

Date: Thu Apr 23 2026 - 12:46:39 EST


From: 'Simon Horman' <horms@xxxxxxxxxx>

This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
Full review at: https://sashiko.dev
---
idpf: do not perform flow ops when netdev is detached

This commit adds checks to prevent crashes when out-of-tree modules bypass
ethtool core locks and call flow operations on detached netdevs during a
hardware reset. It introduces netdev and netif_device_present checks in
idpf_get_rxnfc and idpf_set_rxnfc.

> Even though commit 2e281e1155fc ("idpf: detach and close netdevs while
> handling a reset") prevents ethtool -N/-n operations to operate on
> detached netdevs, we found that out-of-tree workflows like OpenOnload
> can bypass ethtool core locks and call idpf_set_rxnfc directly during
> an idpf HW reset.

Is it acceptable to add defensive checks upstream solely to accommodate
out-of-tree modules that bypass core kernel locking?

The core ethtool infrastructure manages locks like rtnl_lock for in-tree
callers to guarantee safety. Modifying the upstream driver to paper over
locking violations in out-of-tree modules might contradict upstream policies.

Simon says: If this is to address OOT module behaviour then I am concerned.
And I would suggest that the correct approach is to fix
the OOT kernel module.

> diff --git a/drivers/net/ethernet/intel/idpf/idpf_ethtool.c b/drivers/net/ethernet/intel/idpf/idpf_ethtool.c
> index bb99d9e7c65d5..8368a7e6a7544 100644
> --- a/drivers/net/ethernet/intel/idpf/idpf_ethtool.c
> +++ b/drivers/net/ethernet/intel/idpf/idpf_ethtool.c
> @@ -43,6 +43,9 @@ static int idpf_get_rxnfc(struct net_device *netdev, struct ethtool_rxnfc *cmd,
> unsigned int cnt = 0;
> int err = 0;
>
> + if (!netdev || !netif_device_present(netdev))
> + return -ENODEV;
> +

Could this result in undefined behavior if netdev is NULL? Before this check,
np is initialized using netdev_priv(netdev), which does pointer arithmetic.
If netdev can genuinely be NULL from out-of-tree callers, should the
initialization of np be moved after the validation?

Simon says: I think the actual problem here is that netdev is never NULL.
And thus the !netdev check should not be added.

Likewise for idpf_set_rxnfc()

Also, is there a time-of-check to time-of-use race here?

The netif_device_present() check is performed outside the vport_ctrl_lock
critical section.

If a concurrent hardware reset detaches the device and clears the present bit
before acquiring the vport_ctrl_lock to set np->vport = NULL, this thread
might pass the present check, block on idpf_vport_ctrl_lock(netdev), and
then dereference the newly set NULL vport once the lock is acquired.

Should the device presence check be moved inside the vport_ctrl_lock
critical section to prevent the NULL pointer dereference?

Simon says: This seems to be a legitimate concern.
Which seems to further suggest that the problem
should be addressed by the caller.

> idpf_vport_ctrl_lock(netdev);

[ ... ]

> @@ -349,6 +352,9 @@ static int idpf_set_rxnfc(struct net_device *netdev, struct ethtool_rxnfc *cmd)
> {
> int ret = -EOPNOTSUPP;
>
> + if (!netdev || !netif_device_present(netdev))
> + return -ENODEV;
> +
> idpf_vport_ctrl_lock(netdev);

Would this be vulnerable to the same time-of-check to time-of-use race
mentioned above, since the present check is outside the lock?