Re: [PATCH 1/1] RDMA/mlx5: Release CPU for other processes in mlx5_free_cmd_msg()

From: Anand Khoje
Date: Wed May 29 2024 - 08:01:34 EST



On 5/26/24 20:53, Shay Drori wrote:
Hi Anand.

First, the correct mailing list for this patch is
netdev@xxxxxxxxxxxxxxx, please send the next version there.

On 22/05/2024 6:32, Anand Khoje wrote:
In non-FLR context, at times CX-5 requests release of ~8 million device pages.
This needs a huge number of cmd mailboxes, which are released once
the pages are reclaimed. Releasing that many cmd mailboxes consumes
many seconds of CPU time and, on non-preemptible kernels, starves
critical processes on that CPU's run queue (RQ). To alleviate
this, this patch relinquishes the CPU periodically but conditionally.

Orabug: 36275016

this doesn't seem relevant


Signed-off-by: Anand Khoje <anand.a.khoje@xxxxxxxxxx>
---
  drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 7 +++++++
  1 file changed, 7 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
index 9c21bce..9fbf25d 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
@@ -1336,16 +1336,23 @@ static struct mlx5_cmd_msg *mlx5_alloc_cmd_msg(struct mlx5_core_dev *dev,
      return ERR_PTR(err);
  }
  +#define RESCHED_MSEC 2


What if you add cond_resched() on every iteration of the loop? Does it
take much longer to finish 8 million pages, or about the same?
If it does matter, maybe 2 ms is too frequent? 20 ms? 200 ms?

Shay,


There is no hard rule we could apply here, only guidance and suggestions.
Relinquishing the CPU too often (too short an interval) leads to thrashing and high context-switch costs,
while relinquishing it too rarely (too long an interval) leads to RQ starvation.
Based on observations with our applications and workload, we chose 2 msecs as a middle ground.
But your suggestions are also very viable, so we are reconsidering the value.

This was very helpful. Thank you! I will resend a v2 after more testing.

Thanks,

Anand


Thanks

  static void mlx5_free_cmd_msg(struct mlx5_core_dev *dev,
                    struct mlx5_cmd_msg *msg)
  {
      struct mlx5_cmd_mailbox *head = msg->next;
      struct mlx5_cmd_mailbox *next;
+    unsigned long start_time = jiffies;
      while (head) {
          next = head->next;
          free_cmd_box(dev, head);
          head = next;
+        if (time_after(jiffies, start_time + msecs_to_jiffies(RESCHED_MSEC))) {
+            mlx5_core_warn_rl(dev, "Spent more than %d msecs, yielding CPU\n", RESCHED_MSEC);
+            cond_resched();
+            start_time = jiffies;
+        }
      }
      kfree(msg);
  }
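
For anyone who wants to experiment with the interval question outside the kernel, here is a rough userspace sketch of the two strategies discussed above: yielding on every iteration versus yielding only after a time budget has elapsed. It is not the driver code. struct node, build_list(), free_list_yielding() and now_ns() are hypothetical names made up for illustration, and sched_yield() is only a loose stand-in for cond_resched() (cond_resched() reschedules only when a reschedule is pending, while sched_yield() always enters the scheduler), so any timings it produces say nothing definitive about the kernel path.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sched.h>
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical stand-in for struct mlx5_cmd_mailbox: just a list link. */
struct node {
	struct node *next;
};

/* Monotonic clock in nanoseconds (userspace substitute for jiffies). */
static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Build a singly linked list of n nodes; NULL if n == 0 or malloc fails. */
static struct node *build_list(size_t n)
{
	struct node *head = NULL;

	for (size_t i = 0; i < n; i++) {
		struct node *nd = malloc(sizeof(*nd));

		if (!nd) {
			/* Undo the partial list before reporting failure. */
			while (head) {
				struct node *next = head->next;

				free(head);
				head = next;
			}
			return NULL;
		}
		nd->next = head;
		head = nd;
	}
	return head;
}

/*
 * Free every node, yielding the CPU whenever more than budget_ns has
 * elapsed since the last yield. budget_ns == 0 yields on every iteration.
 * Returns the number of yields; *freed receives the number of nodes freed.
 */
static size_t free_list_yielding(struct node *head, uint64_t budget_ns,
				 size_t *freed)
{
	uint64_t start = now_ns();
	size_t yields = 0;

	assert(freed);
	*freed = 0;

	while (head) {
		struct node *next = head->next;

		free(head);
		head = next;
		(*freed)++;

		if (budget_ns == 0 || now_ns() - start > budget_ns) {
			sched_yield();	/* loose analogue of cond_resched() */
			yields++;
			start = now_ns();
		}
	}
	return yields;
}
```

Calling free_list_yielding() on lists of the same length with budget_ns set to 0, 2000000, 20000000 and 200000000 (0, 2, 20 and 200 ms) and comparing the returned yield counts and wall-clock time gives a first feel for the trade-off, though the real numbers have to come from testing the driver itself.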