Re: [PATCH net] net/mlx5: Fix error handling in mlx5_init_one_light()

From: Larysa Zaremba
Date: Fri May 10 2024 - 02:45:02 EST


On Thu, May 09, 2024 at 02:00:18PM +0300, Dan Carpenter wrote:
> If mlx5_query_hca_caps_light() fails, then calling devl_unregister() or
> devl_unlock() is a bug: at that point the devlink instance is neither
> registered nor locked. That will trigger a stack trace because
> devl_unregister() checks both conditions at the start of the function.
>
> If mlx5_devlink_params_register() fails, then this code will call
> devl_unregister() and devl_unlock() twice, which will again lead to a
> stack trace or possibly something worse.
>
> Fixes: bf729988303a ("net/mlx5: Restore mistakenly dropped parts in register devlink flow")
> Fixes: c6e77aa9dd82 ("net/mlx5: Register devlink first under devlink lock")

Reviewed-by: Larysa Zaremba <larysa.zaremba@xxxxxxxxx>

> Signed-off-by: Dan Carpenter <dan.carpenter@xxxxxxxxxx>
> ---
> drivers/net/ethernet/mellanox/mlx5/core/main.c | 10 ++++------
> 1 file changed, 4 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
> index 331ce47f51a1..105c98160327 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
> @@ -1690,7 +1690,7 @@ int mlx5_init_one_light(struct mlx5_core_dev *dev)
>  	err = mlx5_query_hca_caps_light(dev);
>  	if (err) {
>  		mlx5_core_warn(dev, "mlx5_query_hca_caps_light err=%d\n", err);
> -		goto query_hca_caps_err;
> +		goto err_function_disable;
>  	}
>
>  	devl_lock(devlink);
> @@ -1699,18 +1699,16 @@ int mlx5_init_one_light(struct mlx5_core_dev *dev)
>  	err = mlx5_devlink_params_register(priv_to_devlink(dev));
>  	if (err) {
>  		mlx5_core_warn(dev, "mlx5_devlink_param_reg err = %d\n", err);
> -		goto params_reg_err;
> +		goto err_unregister;
>  	}
>
>  	devl_unlock(devlink);
>  	return 0;
>
> -params_reg_err:
> -	devl_unregister(devlink);
> -	devl_unlock(devlink);
> -query_hca_caps_err:
> +err_unregister:
>  	devl_unregister(devlink);
>  	devl_unlock(devlink);
> +err_function_disable:
>  	mlx5_function_disable(dev, true);
>  out:
>  	dev->state = MLX5_DEVICE_STATE_INTERNAL_ERROR;
> --
> 2.43.0
>
>
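
For illustration only, the unwind ordering after this patch can be modelled
in userspace roughly as below. This is a sketch, not the driver code:
struct fake_devlink, the fake_*() helpers, init_light() and the
fail_query/fail_params switches are all hypothetical stand-ins, and the
bad_calls counter only approximates the locked/registered checks that the
commit message attributes to devl_unregister().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of devlink state; this is NOT the kernel API. */
struct fake_devlink {
	bool locked;
	bool registered;
	int bad_calls;	/* calls made while in the wrong state */
};

static void fake_lock(struct fake_devlink *dl)
{
	if (dl->locked)
		dl->bad_calls++;
	dl->locked = true;
}

static void fake_unlock(struct fake_devlink *dl)
{
	if (!dl->locked)
		dl->bad_calls++;
	dl->locked = false;
}

static void fake_register(struct fake_devlink *dl)
{
	if (!dl->locked || dl->registered)
		dl->bad_calls++;
	dl->registered = true;
}

static void fake_unregister(struct fake_devlink *dl)
{
	/* Model of the two checks: must be locked and registered. */
	if (!dl->locked || !dl->registered)
		dl->bad_calls++;
	dl->registered = false;
}

/* fail_query and fail_params inject a failure at the two fallible steps. */
static int init_light(struct fake_devlink *dl, bool fail_query, bool fail_params)
{
	int err;

	err = fail_query ? -1 : 0;	/* stands in for the caps query */
	if (err)
		goto err_function_disable;	/* nothing locked or registered yet */

	fake_lock(dl);
	fake_register(dl);

	err = fail_params ? -2 : 0;	/* stands in for params registration */
	if (err)
		goto err_unregister;

	fake_unlock(dl);
	return 0;

err_unregister:
	fake_unregister(dl);	/* runs exactly once, with the lock held */
	fake_unlock(dl);
err_function_disable:
	/* the real function disables the device and sets the error state here */
	return err;
}
```

Calling init_light() with either switch set should leave the model unlocked
and unregistered with bad_calls still at zero, which is the property the
reordered labels are meant to give. With the old label layout, the first
failure path would have bumped bad_calls on unregister and unlock, and the
second would have done so on the repeated pair.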