Re: [PATCH net] net/mlx5e: XSK, Fix unintended ICOSQ change

From: Alice Mikityanska

Date: Tue Feb 17 2026 - 11:52:34 EST


On Tue, Feb 17, 2026, at 09:45, Tariq Toukan wrote:
> XSK wakeup must use the async ICOSQ (with proper locking), as it is not
> guaranteed to run on the same CPU as the channel.
>
> The commit that converted the NAPI trigger path to use the sync ICOSQ
> incorrectly applied the same change to XSK, causing XSK wakeups to use
> the sync ICOSQ as well. Revert XSK flows to use the async ICOSQ.
>
> XDP program attach/detach triggers channel reopen, while XSK pool
> enable/disable can happen on-the-fly via NDOs without reopening
> channels. As a result, xsk_pool state cannot be reliably used at
> mlx5e_open_channel() time to decide whether an async ICOSQ is needed.
>
> Update the async_icosq_needed logic to depend on the presence of an XDP
> program rather than the xsk_pool, ensuring the async ICOSQ is available
> when XSK wakeups are enabled.
>
> This fixes multiple issues:
>
> 1. Illegal synchronize_rcu() in an RCU read-side critical section via
> mlx5e_xsk_wakeup() -> mlx5e_trigger_napi_icosq() ->
> synchronize_net(). The stack holds RCU read-lock in xsk_poll().
>
> 2. Hitting a NULL pointer dereference in mlx5e_xsk_wakeup():
>
> [] BUG: kernel NULL pointer dereference, address: 0000000000000240
> [] #PF: supervisor read access in kernel mode
> [] #PF: error_code(0x0000) - not-present page
> [] PGD 0 P4D 0
> [] Oops: Oops: 0000 [#1] SMP
> [] CPU: 0 UID: 0 PID: 2255 Comm: qemu-system-x86 Not tainted 6.19.0-rc5+ #229 PREEMPT(none)
> [] Hardware name: [...]
> [] RIP: 0010:mlx5e_xsk_wakeup+0x53/0x90 [mlx5_core]
>
> Reported-by: Daniel Borkmann <daniel@xxxxxxxxxxxxx>
> Closes: https://lore.kernel.org/all/20260123223916.361295-1-daniel@xxxxxxxxxxxxx/
> Fixes: 56aca3e0f730 ("net/mlx5e: Use regular ICOSQ for triggering NAPI")
> Tested-by: Daniel Borkmann <daniel@xxxxxxxxxxxxx>
> Signed-off-by: Tariq Toukan <tariqt@xxxxxxxxxx>
> Reviewed-by: Dragos Tatulea <dtatulea@xxxxxxxxxx>
> ---
> drivers/net/ethernet/mellanox/mlx5/core/en.h | 1 +
> .../ethernet/mellanox/mlx5/core/en/xsk/pool.c | 4 ++--
> .../ethernet/mellanox/mlx5/core/en/xsk/tx.c | 2 +-
> .../net/ethernet/mellanox/mlx5/core/en_main.c | 24 +++++++++++++------
> 4 files changed, 21 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
> index a7de3a3efc49..19fce51117c9 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
> @@ -1103,6 +1103,7 @@ int mlx5e_open_locked(struct net_device *netdev);
> int mlx5e_close_locked(struct net_device *netdev);
>
> void mlx5e_trigger_napi_icosq(struct mlx5e_channel *c);
> +void mlx5e_trigger_napi_async_icosq(struct mlx5e_channel *c);
> void mlx5e_trigger_napi_sched(struct napi_struct *napi);
>
> int mlx5e_open_channels(struct mlx5e_priv *priv,
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/pool.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/pool.c
> index db776e515b6a..5c5360a25c64 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/pool.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/pool.c
> @@ -127,7 +127,7 @@ static int mlx5e_xsk_enable_locked(struct mlx5e_priv *priv,
> goto err_remove_pool;
>
> mlx5e_activate_xsk(c);
> - mlx5e_trigger_napi_icosq(c);
> + mlx5e_trigger_napi_async_icosq(c);
>
> /* Don't wait for WQEs, because the newer xdpsock sample doesn't provide
> * any Fill Ring entries at the setup stage.
> @@ -179,7 +179,7 @@ static int mlx5e_xsk_disable_locked(struct mlx5e_priv *priv, u16 ix)
> c = priv->channels.c[ix];
>
> mlx5e_activate_rq(&c->rq);
> - mlx5e_trigger_napi_icosq(c);
> + mlx5e_trigger_napi_async_icosq(c);
> mlx5e_wait_for_min_rx_wqes(&c->rq, MLX5E_RQ_WQES_TIMEOUT);
>
> mlx5e_rx_res_xsk_update(priv->rx_res, &priv->channels, ix, false);
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/tx.c
> index 9e33156fac8a..8aeab4b21035 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/tx.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/tx.c
> @@ -34,7 +34,7 @@ int mlx5e_xsk_wakeup(struct net_device *dev, u32 qid, u32 flags)
> &c->async_icosq->state))
> return 0;
>
> - mlx5e_trigger_napi_icosq(c);
> + mlx5e_trigger_napi_async_icosq(c);
> }
>
> return 0;
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> index 4b8084420816..6a7ca4571c19 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> @@ -2744,16 +2744,26 @@ static int mlx5e_channel_stats_alloc(struct mlx5e_priv *priv, int ix, int cpu)
>
> void mlx5e_trigger_napi_icosq(struct mlx5e_channel *c)
> {
> + struct mlx5e_icosq *sq = &c->icosq;
> bool locked;
>
> - if (!test_and_set_bit(MLX5E_SQ_STATE_LOCK_NEEDED, &c->icosq.state))
> - synchronize_net();
> + set_bit(MLX5E_SQ_STATE_LOCK_NEEDED, &sq->state);
> + synchronize_net();
>
> - locked = mlx5e_icosq_sync_lock(&c->icosq);
> - mlx5e_trigger_irq(&c->icosq);
> - mlx5e_icosq_sync_unlock(&c->icosq, locked);
> + locked = mlx5e_icosq_sync_lock(sq);
> + mlx5e_trigger_irq(sq);
> + mlx5e_icosq_sync_unlock(sq, locked);
>
> - clear_bit(MLX5E_SQ_STATE_LOCK_NEEDED, &c->icosq.state);
> + clear_bit(MLX5E_SQ_STATE_LOCK_NEEDED, &sq->state);
> +}
> +
> +void mlx5e_trigger_napi_async_icosq(struct mlx5e_channel *c)
> +{
> + struct mlx5e_icosq *sq = c->async_icosq;
> +
> + spin_lock_bh(&sq->lock);
> + mlx5e_trigger_irq(sq);
> + spin_unlock_bh(&sq->lock);
> }
>
> void mlx5e_trigger_napi_sched(struct napi_struct *napi)
> @@ -2836,7 +2846,7 @@ static int mlx5e_open_channel(struct mlx5e_priv *priv, int ix,
> netif_napi_add_config_locked(netdev, &c->napi, mlx5e_napi_poll, ix);
> netif_napi_set_irq_locked(&c->napi, irq);
>
> - async_icosq_needed = !!xsk_pool || priv->ktls_rx_was_enabled;
> + async_icosq_needed = !!params->xdp_prog || priv->ktls_rx_was_enabled;

Acked-by: Alice Mikityanska <alice.kernel@xxxxxxxxxxx>

With a follow-up suggestion that we discussed at:

https://lore.kernel.org/netdev/8a3a3ff4-16c7-4d99-8854-38d741cc6b82@xxxxxxxxx/

> err = mlx5e_open_queues(c, params, cparam, async_icosq_needed);
> if (unlikely(err))
> goto err_napi_del;
>
> base-commit: ee5492fd88cfc079c19fbeac78e9e53b7f6c04f3
> --
> 2.44.0