Re: [PATCH net] net/mlx5: Skip disabled vports when setting max TX speed

From: Jakub Kicinski

Date: Thu May 14 2026 - 21:35:33 EST

This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
---
net/mlx5: Skip disabled vports when setting max TX speed

This commit modifies the LAG activation process to skip setting the maximum
TX speed for disabled vports, preventing firmware commands from being sent
to uninitialized vports. It stores the aggregated speed in the vport
structure so it can be applied later if the vport is enabled.

> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
> --- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c

[ ... ]

> @@ -948,6 +966,9 @@ int mlx5_esw_vport_enable(struct mlx5_eswitch *esw, struct mlx5_vport *vport,
>
> esw->enabled_vports++;
> esw_debug(esw->dev, "Enabled VPORT(%d)\n", vport_num);
> +
> + if (vport->agg_max_tx_speed)

Could dynamically allocated vports bypass the active LAG QoS speed limits?

If a new vport is allocated dynamically after the LAG is already active,
it appears agg_max_tx_speed would be zero-initialized. When this new vport
is subsequently enabled here, the condition evaluates to false, which
would skip the firmware speed update entirely.

> + mlx5_esw_vport_set_max_tx_speed(esw, vport);

Does this introduce a race condition with mlx5_lag_modify_device_vports_speed?

If the LAG thread is concurrently updating the speed:

LAG thread (lockless):
vport->agg_max_tx_speed = speed;
if (!vport->enabled)
continue;

ESW thread (with esw->state_lock):
vport->enabled = true;
...
if (vport->agg_max_tx_speed)
mlx5_esw_vport_set_max_tx_speed(esw, vport);

Since there are no shared locks or explicit memory barriers, could CPU
reordering allow the LAG thread to see a false enabled state while the ESW
thread sees a zero agg_max_tx_speed, resulting in neither thread issuing the
firmware command?

> done:
> mutex_unlock(&esw->state_lock);

[ ... ]

> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
> --- a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c

[ ... ]

> @@ -1274,6 +1274,11 @@ static void mlx5_lag_modify_device_vports_speed(struct mlx5_core_dev *mdev,
> if (vport->vport == MLX5_VPORT_UPLINK)
> continue;
>
> + vport->agg_max_tx_speed = speed;
> +
> + if (!vport->enabled)

Is there a regression here regarding the lifetime of the vport pointer?

This loop locklessly iterates over the esw->vports xarray. Because the xarray
iteration drops the RCU read lock before executing the loop body, and
mlx5_esw_vport_free() calls kfree() synchronously without an RCU grace
period, could a concurrent vport deletion cause these new field accesses to
operate on freed memory?

> + continue;
> +
> ret = mlx5_modify_vport_max_tx_speed(mdev, op_mod,
> vport->vport, true, speed);

Does this code contain a time-of-check to time-of-use regression
that is affected by the lockless architecture here?

Looking at mlx5_modify_vport_max_tx_speed(), it queries the firmware
admin_state and writes it back without holding esw->state_lock. If the LAG
thread reads an UP state, but mlx5_esw_vport_disable() concurrently disables
the vport (setting the hardware state to DOWN), could the LAG thread
accidentally overwrite the admin_state back to UP, effectively reverting the
disable action?

> if (ret)
--
pw-bot: cr