[PATCH iwl-net 2/2] ice: preserve uplink DFLT Rx rule on switchdev release

From: Petr Oros

Date: Thu Jun 18 2026 - 11:10:09 EST


ice_eswitch_setup_env() calls ice_set_dflt_vsi() to install the
ICE_SW_LKUP_DFLT Rx rule on the uplink VSI. The helper returns 0 even
when the rule is already in place, so the call is a no-op if
ice_vsi_sync_fltr() had previously installed the DFLT rule in response
to IFF_PROMISC on the uplink netdev. ice_remove_vsi_fltr(), called
earlier in ice_eswitch_setup_env(), does not affect this rule because
ice_remove_vsi_lkup_fltr() lacks a case for ICE_SW_LKUP_DFLT and falls
into its default branch, which only logs. Switchdev mode then adds an
ICE_FLTR_TX leg via ice_cfg_dflt_vsi() on the same VSI handle.

ice_eswitch_release_env() unconditionally removed both the Rx and Tx
DFLT rules. When the Rx DFLT was installed by ice_vsi_sync_fltr()
before the switchdev session started, this clobbered promisc state the
operator had asked for: the DFLT Rx rule disappeared while IFF_PROMISC
was still set on the netdev, and the IFF_PROMISC sync path was not
retriggered, so the uplink ended the session without the catch-all
rule the netdev flags requested.

Skip the Rx DFLT removal when the uplink is still promiscuous, both in
ice_eswitch_release_env() and in the err_def_tx unwind of
ice_eswitch_setup_env(). The Tx leg installed by switchdev is always
removed since switchdev owns it.

The ena_rx_filtering() call earlier in ice_eswitch_release_env() is
left unconditional: it calls ice_cfg_vlan_pruning(), which returns
without enabling pruning while the netdev is in IFF_PROMISC, so it
cannot re-enable VLAN pruning under the preserved DFLT rule and start
dropping tagged traffic. Pruning is re-enabled later, when the
IFF_PROMISC sync path runs after promisc is actually cleared.

Use vsi->current_netdev_flags rather than the live netdev->flags for
this test. netdev->flags is written under RTNL by dev_change_flags(),
while ice_eswitch_release_env() runs under devl_lock, so reading it
here would be a TOCTOU against a concurrent promisc change. The
IFF_PROMISC bit of current_netdev_flags is written only under
ICE_CFG_BUSY by ice_vsi_sync_fltr(), and ice_set_rx_mode() gates that
sync off for the uplink while ice_is_switchdev_running() is true. The
bit is therefore frozen for the whole session and stable when
release_env reads it.

Because the sync is gated off during the session, a promisc change the
operator makes while switchdev runs never reaches ice_vsi_sync_fltr():
current_netdev_flags keeps the value captured before the session while
netdev->flags carries the new one. Once switchdev is torn down and
pf->eswitch.is_running is cleared, schedule a filter sync from
ice_eswitch_disable_switchdev() so the suppressed change is replayed
and the DFLT Rx rule is reconciled with the current netdev flags. This
also closes the window where release_env kept the rule based on the
frozen flag but the operator had since cleared IFF_PROMISC.

Fixes: 1a1c40df2e80 ("ice: set and release switchdev environment")
Signed-off-by: Petr Oros <poros@xxxxxxxxxx>
---
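Note for reviewers (not part of the commit): the ownership rule described
above can be sketched as a small userspace model. This is only an
illustration under simplifying assumptions, not driver code. Every toy_*
name and TOY_IFF_PROMISC are hypothetical, locking is ignored, and the
service-task replay is modelled as a direct synchronous call.

```c
#include <stdbool.h>

/* Toy stand-in for IFF_PROMISC so the sketch needs no kernel headers. */
#define TOY_IFF_PROMISC 0x100u

/* Hypothetical model of the uplink state this patch reasons about. */
struct toy_uplink {
	unsigned int netdev_flags;         /* live flags, set by the operator */
	unsigned int current_netdev_flags; /* snapshot last seen by the sync */
	bool dflt_rx;                      /* Rx DFLT rule present */
	bool dflt_tx;                      /* Tx DFLT rule present */
	bool switchdev_running;
};

/* Models the promisc part of the filter sync, gated off during switchdev. */
static void toy_sync(struct toy_uplink *u)
{
	if (u->switchdev_running)
		return;
	u->current_netdev_flags = u->netdev_flags;
	u->dflt_rx = (u->netdev_flags & TOY_IFF_PROMISC) != 0;
}

/* Operator toggles promisc; the sync only takes effect outside a session. */
static void toy_set_promisc(struct toy_uplink *u, bool on)
{
	if (on)
		u->netdev_flags |= TOY_IFF_PROMISC;
	else
		u->netdev_flags &= ~TOY_IFF_PROMISC;
	toy_sync(u);
}

/* Session start: adding the Rx rule is a no-op if it is already there. */
static void toy_setup_env(struct toy_uplink *u)
{
	u->dflt_rx = true;
	u->dflt_tx = true;
	u->switchdev_running = true;
}

/* Session end. patched == false models the old unconditional removal. */
static void toy_disable_switchdev(struct toy_uplink *u, bool patched)
{
	u->dflt_tx = false; /* switchdev owns the Tx leg, always removed */
	if (!patched || !(u->current_netdev_flags & TOY_IFF_PROMISC))
		u->dflt_rx = false;
	u->switchdev_running = false;
	if (patched)
		toy_sync(u); /* replay any change suppressed in the session */
}

/* Runs one session; true if the Rx rule matches the live flags afterwards. */
static bool toy_session_consistent(bool promisc_before, bool promisc_during,
				   bool patched)
{
	struct toy_uplink u = { 0 };

	toy_set_promisc(&u, promisc_before);
	toy_setup_env(&u);
	toy_set_promisc(&u, promisc_during);
	toy_disable_switchdev(&u, patched);
	return u.dflt_rx == ((u.netdev_flags & TOY_IFF_PROMISC) != 0) &&
	       !u.dflt_tx;
}
```

In this model, toy_session_consistent(true, true, false) returns false,
which corresponds to the reported problem: promisc is still requested but
the Rx rule is gone. With patched set to true, all four combinations of
promisc before and during the session return true.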
drivers/net/ethernet/intel/ice/ice_eswitch.c | 32 +++++++++++++++++---
1 file changed, 28 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_eswitch.c b/drivers/net/ethernet/intel/ice/ice_eswitch.c
index 2e4f0969035f77..b6073fc2375019 100644
--- a/drivers/net/ethernet/intel/ice/ice_eswitch.c
+++ b/drivers/net/ethernet/intel/ice/ice_eswitch.c
@@ -66,8 +66,10 @@ static int ice_eswitch_setup_env(struct ice_pf *pf)
 	ice_cfg_dflt_vsi(uplink_vsi->port_info, uplink_vsi->idx, false,
 			 ICE_FLTR_TX);
 err_def_tx:
-	ice_cfg_dflt_vsi(uplink_vsi->port_info, uplink_vsi->idx, false,
-			 ICE_FLTR_RX);
+	/* keep the Rx DFLT rule if still promiscuous (see release_env) */
+	if (!(uplink_vsi->current_netdev_flags & IFF_PROMISC))
+		ice_cfg_dflt_vsi(uplink_vsi->port_info, uplink_vsi->idx,
+				 false, ICE_FLTR_RX);
 err_def_rx:
 	ice_vsi_del_vlan_zero(uplink_vsi);
 err_vlan_zero:
@@ -275,11 +277,23 @@ static void ice_eswitch_release_env(struct ice_pf *pf)
 	vlan_ops = ice_get_compat_vsi_vlan_ops(uplink_vsi);

 	ice_vsi_update_local_lb(uplink_vsi, false);
+	/* No-op while IFF_PROMISC is set: ice_cfg_vlan_pruning() self-gates on
+	 * it, so this cannot re-enable VLAN pruning under a preserved DFLT rule.
+	 */
 	vlan_ops->ena_rx_filtering(uplink_vsi);
 	ice_cfg_dflt_vsi(uplink_vsi->port_info, uplink_vsi->idx, false,
 			 ICE_FLTR_TX);
-	ice_cfg_dflt_vsi(uplink_vsi->port_info, uplink_vsi->idx, false,
-			 ICE_FLTR_RX);
+
+	/* Keep the Rx DFLT rule if the uplink is still promiscuous; it must
+	 * outlive the session. current_netdev_flags is used because its
+	 * IFF_PROMISC bit only changes under ice_vsi_sync_fltr(), gated off
+	 * during switchdev, so the read cannot race the RTNL netdev->flags.
+	 * Any change made during the session is replayed on teardown.
+	 */
+	if (!(uplink_vsi->current_netdev_flags & IFF_PROMISC))
+		ice_cfg_dflt_vsi(uplink_vsi->port_info, uplink_vsi->idx,
+				 false, ICE_FLTR_RX);
+
 	ice_fltr_add_mac_and_broadcast(uplink_vsi,
 				       uplink_vsi->port_info->mac.perm_addr,
 				       ICE_FWD_TO_VSI);
@@ -327,10 +341,20 @@ static int ice_eswitch_enable_switchdev(struct ice_pf *pf)
  */
 static void ice_eswitch_disable_switchdev(struct ice_pf *pf)
 {
+	struct ice_vsi *uplink_vsi = pf->eswitch.uplink_vsi;
+
 	ice_eswitch_br_offloads_deinit(pf);
 	ice_eswitch_release_env(pf);

 	pf->eswitch.is_running = false;
+
+	/* ice_set_rx_mode() was gated off during the session; replay a filter
+	 * sync so any suppressed promisc change reconciles the DFLT Rx rule.
+	 */
+	set_bit(ICE_VSI_UMAC_FLTR_CHANGED, uplink_vsi->state);
+	set_bit(ICE_VSI_MMAC_FLTR_CHANGED, uplink_vsi->state);
+	set_bit(ICE_FLAG_FLTR_SYNC, pf->flags);
+	ice_service_task_schedule(pf);
 }

 /**
--
2.53.0