RE: [Intel-wired-lan] [PATCH iwl-net 2/2] ice: preserve uplink DFLT Rx rule on switchdev release

From: Loktionov, Aleksandr

Date: Thu Jun 18 2026 - 12:03:09 EST




> -----Original Message-----
> From: Intel-wired-lan <intel-wired-lan-bounces@xxxxxxxxxx> On Behalf
> Of Petr Oros
> Sent: Thursday, June 18, 2026 5:09 PM
> To: netdev@xxxxxxxxxxxxxxx
> Cc: Vecera, Ivan <ivecera@xxxxxxxxxx>; Alice Michael
> <alice.michael@xxxxxxxxx>; Kitszel, Przemyslaw
> <przemyslaw.kitszel@xxxxxxxxx>; Eric Dumazet <edumazet@xxxxxxxxxx>;
> linux-kernel@xxxxxxxxxxxxxxx; Andrew Lunn <andrew+netdev@xxxxxxx>;
> Nguyen, Anthony L <anthony.l.nguyen@xxxxxxxxx>; Michal Swiatkowski
> <michal.swiatkowski@xxxxxxxxxxxxxxx>; Keller, Jacob E
> <jacob.e.keller@xxxxxxxxx>; Jakub Kicinski <kuba@xxxxxxxxxx>; Paolo
> Abeni <pabeni@xxxxxxxxxx>; David S. Miller <davem@xxxxxxxxxxxxx>;
> intel-wired-lan@xxxxxxxxxxxxxxxx
> Subject: [Intel-wired-lan] [PATCH iwl-net 2/2] ice: preserve uplink
> DFLT Rx rule on switchdev release
>
> ice_eswitch_setup_env() calls ice_set_dflt_vsi() to install the
> ICE_SW_LKUP_DFLT Rx rule on the uplink VSI. The helper returns 0 even
> when the rule is already in place, so the call is a no-op if
> ice_vsi_sync_fltr() had previously installed the DFLT rule in response
> to IFF_PROMISC on the uplink netdev. ice_remove_vsi_fltr() called
> earlier in ice_eswitch_setup_env() does not affect this rule because
> ice_remove_vsi_lkup_fltr() lacks a case for ICE_SW_LKUP_DFLT and falls
> into its default branch which only logs. Switchdev mode then adds an
> ICE_FLTR_TX leg via ice_cfg_dflt_vsi() on the same VSI handle.
>
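
For readers following the thread, a minimal userspace sketch of the
behaviour described above, not the driver code. struct model_sw and
model_set_dflt_rx() are hypothetical names; the only thing taken from
the commit message is that installing an already-present DFLT Rx rule
returns 0, so the caller cannot tell who installed the rule.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: one DFLT Rx rule slot, with no owner recorded. */
struct model_sw {
	bool dflt_rx_present;
};

/* Models the described ice_set_dflt_vsi() behaviour: returns 0 both when
 * the rule is newly added and when it was already in place.
 */
static int model_set_dflt_rx(struct model_sw *sw)
{
	assert(sw != NULL);
	if (sw->dflt_rx_present)
		return 0;	/* already installed: silent no-op */
	sw->dflt_rx_present = true;
	return 0;
}
```

Since both outcomes look the same to the caller, the teardown side needs
some other signal (here, the promisc flag) to decide whether the rule
may be removed.
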
> ice_eswitch_release_env() unconditionally removed both the Rx and Tx
> DFLT rules. When the Rx DFLT was installed by ice_vsi_sync_fltr()
> before the switchdev session started, this clobbered promisc state the
> operator had asked for: the DFLT Rx rule disappeared while IFF_PROMISC
> was still set on the netdev, and the IFF_PROMISC sync path was not
> retriggered, so the uplink ended the session without the catch-all
> rule the netdev flags requested.
>
> Skip the Rx DFLT removal when the uplink is still promiscuous, both in
> ice_eswitch_release_env() and in the err_def_tx unwind of
> ice_eswitch_setup_env(). The Tx leg installed by switchdev is always
> removed since switchdev owns it.
>
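
The rule in this paragraph can be sketched as follows, as an
illustration only. struct model_vsi, model_put_dflt_rx() and
model_release_env() are hypothetical stand-ins; the patch open-codes
the IFF_PROMISC test at both call sites rather than using a helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define IFF_PROMISC 0x100	/* same value as in <linux/if.h> */

/* Hypothetical stand-in for the uplink VSI, not struct ice_vsi. */
struct model_vsi {
	unsigned int current_netdev_flags;
	bool dflt_rx;
	bool dflt_tx;
};

/* Rx leg: dropped only when the uplink is not promiscuous. The same test
 * applies in the release path and in the err_def_tx unwind.
 */
static void model_put_dflt_rx(struct model_vsi *vsi)
{
	assert(vsi != NULL);
	if (!(vsi->current_netdev_flags & IFF_PROMISC))
		vsi->dflt_rx = false;
}

/* Release path: the Tx leg belongs to switchdev and always goes away. */
static void model_release_env(struct model_vsi *vsi)
{
	assert(vsi != NULL);
	vsi->dflt_tx = false;
	model_put_dflt_rx(vsi);
}
```
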
> The ena_rx_filtering() call earlier in ice_eswitch_release_env() is
> left unconditional: it calls ice_cfg_vlan_pruning(), which returns
> without enabling pruning while the netdev is in IFF_PROMISC, so it
> cannot re-enable VLAN pruning under the preserved DFLT rule and drop
> tagged traffic. Pruning is re-enabled later, when the IFF_PROMISC sync
> path runs after promisc is actually cleared.
>
> Use vsi->current_netdev_flags rather than the live netdev->flags for
> this test. netdev->flags is written under RTNL by dev_change_flags(),
> while ice_eswitch_release_env() runs under devl_lock, so reading it
> here would be a TOCTOU against a concurrent promisc change. The
> IFF_PROMISC bit of current_netdev_flags is written only under
> ICE_CFG_BUSY by ice_vsi_sync_fltr(), and ice_set_rx_mode() gates that
> sync off for the uplink while ice_is_switchdev_running() is true. The
> bit is therefore frozen for the whole session and stable when
> release_env reads it.
>
> Because the sync is gated off during the session, a promisc change the
> operator makes while switchdev runs never reaches ice_vsi_sync_fltr():
> current_netdev_flags keeps the value captured before the session while
> netdev->flags carries the new one. Once switchdev is torn down and
> pf->eswitch.is_running is cleared, schedule a filter sync from
> ice_eswitch_disable_switchdev() so the suppressed change is replayed
> and the DFLT Rx rule is reconciled with the current netdev flags. This
> also closes the window where release_env kept the rule based on the
> frozen flag but the operator had since cleared IFF_PROMISC.
>
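
A small userspace model of the last two paragraphs, assuming the
gating works exactly as the commit message describes. All names here
(struct model_uplink and the model_*() functions) are hypothetical and
the locking is left out; it only shows why the flag is frozen during
the session and why the replay on teardown is needed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define IFF_PROMISC 0x100	/* same value as in <linux/if.h> */

struct model_uplink {
	unsigned int netdev_flags;		/* live flags set by the operator */
	unsigned int current_netdev_flags;	/* last value the sync applied */
	bool switchdev_running;
	bool sync_pending;
	bool dflt_rx;
};

/* Operator changes flags; the sync request is dropped while switchdev runs. */
static void model_set_rx_mode(struct model_uplink *u, unsigned int flags)
{
	assert(u != NULL);
	u->netdev_flags = flags;
	if (!u->switchdev_running)
		u->sync_pending = true;
}

/* Sync path: reconcile the DFLT Rx rule with the live flags. */
static void model_sync_fltr(struct model_uplink *u)
{
	assert(u != NULL);
	if (!u->sync_pending)
		return;
	u->sync_pending = false;
	u->current_netdev_flags = u->netdev_flags;
	u->dflt_rx = (u->netdev_flags & IFF_PROMISC) != 0;
}

/* Teardown: decide on the frozen flag, then schedule the replay. */
static void model_disable_switchdev(struct model_uplink *u)
{
	assert(u != NULL);
	if (!(u->current_netdev_flags & IFF_PROMISC))
		u->dflt_rx = false;
	u->switchdev_running = false;
	u->sync_pending = true;
}
```

In this model, clearing promisc during the session leaves the rule in
place at release time and the replayed sync then removes it, which is
the window the last sentence refers to.
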
> Fixes: 1a1c40df2e80 ("ice: set and release switchdev environment")
> Signed-off-by: Petr Oros <poros@xxxxxxxxxx>
> ---
> drivers/net/ethernet/intel/ice/ice_eswitch.c | 32 +++++++++++++++++---
> 1 file changed, 28 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/ice/ice_eswitch.c b/drivers/net/ethernet/intel/ice/ice_eswitch.c
> index 2e4f0969035f77..b6073fc2375019 100644
> --- a/drivers/net/ethernet/intel/ice/ice_eswitch.c
> +++ b/drivers/net/ethernet/intel/ice/ice_eswitch.c
> @@ -66,8 +66,10 @@ static int ice_eswitch_setup_env(struct ice_pf *pf)
> ice_cfg_dflt_vsi(uplink_vsi->port_info, uplink_vsi->idx, false,
> ICE_FLTR_TX);
> err_def_tx:
> - ice_cfg_dflt_vsi(uplink_vsi->port_info, uplink_vsi->idx, false,
> - ICE_FLTR_RX);
> + /* keep the Rx DFLT rule if still promiscuous (see release_env) */
> + if (!(uplink_vsi->current_netdev_flags & IFF_PROMISC))
> + ice_cfg_dflt_vsi(uplink_vsi->port_info, uplink_vsi->idx,
> + false, ICE_FLTR_RX);
> err_def_rx:
> ice_vsi_del_vlan_zero(uplink_vsi);
> err_vlan_zero:
> @@ -275,11 +277,23 @@ static void ice_eswitch_release_env(struct ice_pf *pf)
> vlan_ops = ice_get_compat_vsi_vlan_ops(uplink_vsi);
>
> ice_vsi_update_local_lb(uplink_vsi, false);
> + /* No-op while IFF_PROMISC is set: ice_cfg_vlan_pruning() self-gates on
> + * it, so this cannot re-enable VLAN pruning under a preserved DFLT rule.
> + */
> vlan_ops->ena_rx_filtering(uplink_vsi);
> ice_cfg_dflt_vsi(uplink_vsi->port_info, uplink_vsi->idx, false,
> ICE_FLTR_TX);
> - ice_cfg_dflt_vsi(uplink_vsi->port_info, uplink_vsi->idx, false,
> - ICE_FLTR_RX);
> +
> + /* Keep the Rx DFLT rule if the uplink is still promiscuous; it must
> + * outlive the session. current_netdev_flags is used because its
> + * IFF_PROMISC bit only changes under ice_vsi_sync_fltr(), gated off
> + * during switchdev, so the read cannot race the RTNL netdev->flags.
> + * Any change made during the session is replayed on teardown.
> + */
> + if (!(uplink_vsi->current_netdev_flags & IFF_PROMISC))
> + ice_cfg_dflt_vsi(uplink_vsi->port_info, uplink_vsi->idx,
> + false, ICE_FLTR_RX);
> +
> ice_fltr_add_mac_and_broadcast(uplink_vsi,
> uplink_vsi->port_info->mac.perm_addr,
> ICE_FWD_TO_VSI);
> @@ -327,10 +341,20 @@ static int ice_eswitch_enable_switchdev(struct ice_pf *pf)
> */
> static void ice_eswitch_disable_switchdev(struct ice_pf *pf)
> {
> + struct ice_vsi *uplink_vsi = pf->eswitch.uplink_vsi;
> +
> ice_eswitch_br_offloads_deinit(pf);
> ice_eswitch_release_env(pf);
>
> pf->eswitch.is_running = false;
> +
> + /* ice_set_rx_mode() was gated off during the session; replay a filter
> + * sync so any suppressed promisc change reconciles the DFLT Rx rule.
> + */
> + set_bit(ICE_VSI_UMAC_FLTR_CHANGED, uplink_vsi->state);
> + set_bit(ICE_VSI_MMAC_FLTR_CHANGED, uplink_vsi->state);
> + set_bit(ICE_FLAG_FLTR_SYNC, pf->flags);
> + ice_service_task_schedule(pf);
> }
>
> /**
> --
> 2.53.0

Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@xxxxxxxxx>