Re: [PATCH net 3/3] net/mlx5: SD: Serialize init/cleanup

From: Jakub Kicinski

Date: Wed Apr 01 2026 - 23:16:36 EST


This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
---
net/mlx5: SD: Serialize init/cleanup

This commit introduces a state tracker and locking around the Socket-Direct
init and cleanup routines. It prevents multiple physical functions from
concurrently bringing up or tearing down the SD group.

> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/sd.c b/drivers/net/ethernet/mellanox/mlx5/core/lib/sd.c
> index 4c80b9d25283..374f27b78fbe 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/lib/sd.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/sd.c

[ ... ]

> @@ -505,11 +520,16 @@ void mlx5_sd_cleanup(struct mlx5_core_dev *dev)
> if (!sd)
> return;
>
> + mlx5_devcom_comp_lock(sd->devcom);
> if (!mlx5_devcom_comp_is_ready(sd->devcom))
> - goto out;
> + goto out_unlock;
>
> primary = mlx5_sd_get_primary(dev);
> primary_sd = mlx5_get_sd(primary);
> +
> + if (primary_sd->state != MLX5_SD_STATE_UP)
> + goto out_unlock;
> +
> mlx5_sd_for_each_secondary(i, primary, pos)
> sd_cmd_unset_secondary(pos);
> sd_cmd_unset_primary(primary);
> @@ -517,7 +537,9 @@ void mlx5_sd_cleanup(struct mlx5_core_dev *dev)
> primary_sd->dfs = NULL;
>
> sd_info(primary, "group id %#x, uncombined\n", sd->group_id);
> -out:
> + primary_sd->state = MLX5_SD_STATE_DOWN;
> +out_unlock:
> + mlx5_devcom_comp_unlock(sd->devcom);
> sd_unregister(dev);
> sd_cleanup(dev);
> }

Is there a race condition if a secondary PF calls mlx5_sd_init() while the
primary PF is running mlx5_sd_cleanup()?

In mlx5_sd_cleanup(), the code sets the state to MLX5_SD_STATE_DOWN and drops
the devcom lock before calling sd_unregister(). At this point, the primary
PF is still registered in the devcom group, so mlx5_devcom_comp_is_ready()
remains true.

If a secondary PF is concurrently probed, could it acquire the devcom lock
and observe this intermediate state?

CPU1
mlx5_sd_cleanup()
primary_sd->state = MLX5_SD_STATE_DOWN;
mlx5_devcom_comp_unlock(sd->devcom);
// Preempted before sd_unregister(dev)

CPU2 would see this in mlx5_sd_init():
mlx5_devcom_comp_lock(sd->devcom);
if (!mlx5_devcom_comp_is_ready(sd->devcom)) // returns true
goto out;

primary = mlx5_sd_get_primary(dev);
primary_sd = mlx5_get_sd(primary);

if (primary_sd->state == MLX5_SD_STATE_UP) // returns false
goto out;
...
debugfs_create_file("primary", 0400, primary_sd->dfs, primary, &dev_fops);

Would CPU2 proceed to initialize the group using the dying primary PF?

If CPU1 then resumes and finishes sd_cleanup(), it would free the primary_sd
structure and the primary device. This might leave the secondary PF with
dangling pointers in the debugfs files, potentially leading to a
use-after-free on access.

Furthermore, when the secondary is later removed, mlx5_sd_cleanup(secondary)
might see is_ready as false and skip the teardown entirely, leaving the
dangling debugfs files permanently.