Re: [PATCH mlx5-vhost v2 01/10] net/mlx5: Support throttled commands from async API

From: Dragos Tatulea
Date: Mon Sep 09 2024 - 05:35:54 EST




On 16.08.24 11:01, Dragos Tatulea wrote:
> Currently, commands that qualify as throttled can't be used via the
> async API. That's due to the fact that the throttle semaphore can sleep
> but the async API can't.
>
> This patch allows throttling in the async API by using the tentative
> variant of the semaphore and upon failure (semaphore at 0) returns EBUSY
> to signal to the caller that they need to wait for the completion of
> previously issued commands.
>
> Furthermore, make sure that the semaphore is released in the callback.
>
> Signed-off-by: Dragos Tatulea <dtatulea@xxxxxxxxxx>
> Cc: Leon Romanovsky <leonro@xxxxxxxxxx>
> Reviewed-by: Tariq Toukan <tariqt@xxxxxxxxxx>
Same reminder as in v1: Tariq is the maintainer for mlx5 so his review
also counts as Acked-by.

Thanks,
Dragos

> ---
> drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 21 ++++++++++++++-----
> 1 file changed, 16 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
> index 20768ef2e9d2..f69c977c1569 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
> @@ -1882,10 +1882,12 @@ static int cmd_exec(struct mlx5_core_dev *dev, void *in, int in_size, void *out,
>
> throttle_op = mlx5_cmd_is_throttle_opcode(opcode);
> if (throttle_op) {
> - /* atomic context may not sleep */
> - if (callback)
> - return -EINVAL;
> - down(&dev->cmd.vars.throttle_sem);
> + if (callback) {
> + if (down_trylock(&dev->cmd.vars.throttle_sem))
> + return -EBUSY;
> + } else {
> + down(&dev->cmd.vars.throttle_sem);
> + }
> }
>
> pages_queue = is_manage_pages(in);
> @@ -2091,10 +2093,19 @@ static void mlx5_cmd_exec_cb_handler(int status, void *_work)
> {
> struct mlx5_async_work *work = _work;
> struct mlx5_async_ctx *ctx;
> + struct mlx5_core_dev *dev;
> + u16 opcode;
>
> ctx = work->ctx;
> - status = cmd_status_err(ctx->dev, status, work->opcode, work->op_mod, work->out);
> + dev = ctx->dev;
> + opcode = work->opcode;
> + status = cmd_status_err(dev, status, work->opcode, work->op_mod, work->out);
> work->user_callback(status, work);
> + /* Can't access "work" from this point on. It could have been freed in
> + * the callback.
> + */
> + if (mlx5_cmd_is_throttle_opcode(opcode))
> + up(&dev->cmd.vars.throttle_sem);
> if (atomic_dec_and_test(&ctx->num_inflight))
> complete(&ctx->inflight_done);
> }