On Wed, May 22, 2024 at 09:02:56AM +0530, Anand Khoje wrote:
In a non-FLR context, at times CX-5 requests the release of ~8 million device pages.

Did you consider making this function asynchronous and parallel?

This needs a huge number of cmd mailboxes, which have to be released once
the pages are reclaimed. Releasing that many cmd mailboxes consumes CPU
time running into many seconds, which on non-preemptible kernels starves
critical processes on that CPU’s RQ. To alleviate this, this patch
relinquishes the CPU periodically but conditionally.
Orabug: 36275016
Signed-off-by: Anand Khoje <anand.a.khoje@xxxxxxxxxx>
---
drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
index 9c21bce..9fbf25d 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
@@ -1336,16 +1336,23 @@ static struct mlx5_cmd_msg *mlx5_alloc_cmd_msg(struct mlx5_core_dev *dev,
 	return ERR_PTR(err);
 }
 
+#define RESCHED_MSEC 2
 static void mlx5_free_cmd_msg(struct mlx5_core_dev *dev,
 			      struct mlx5_cmd_msg *msg)
 {
 	struct mlx5_cmd_mailbox *head = msg->next;
 	struct mlx5_cmd_mailbox *next;
+	unsigned long start_time = jiffies;
 
 	while (head) {
 		next = head->next;
 		free_cmd_box(dev, head);

Thanks

 		head = next;
+		if (time_after(jiffies, start_time + msecs_to_jiffies(RESCHED_MSEC))) {
+			mlx5_core_warn_rl(dev, "Spent more than %d msecs, yielding CPU\n", RESCHED_MSEC);
+			cond_resched();
+			start_time = jiffies;
+		}
 	}
 	kfree(msg);
 }
--
1.8.3.1
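
For anyone who wants to try the time-budgeted yield pattern from the hunk above outside the kernel, here is a rough userspace sketch. It is an analogy under stated assumptions, not the driver code: struct mailbox, now_ms(), mailbox_chain_free() and mailbox_chain_alloc() are hypothetical names, clock_gettime() stands in for jiffies, and sched_yield() stands in for cond_resched().

```c
#define _POSIX_C_SOURCE 200809L
#include <sched.h>
#include <stdlib.h>
#include <time.h>

#define YIELD_BUDGET_MS 2	/* mirrors RESCHED_MSEC from the patch */

/* Hypothetical stand-in for struct mlx5_cmd_mailbox. */
struct mailbox {
	struct mailbox *next;
};

/* Monotonic clock in milliseconds; plays the role of jiffies here. */
static long long now_ms(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (long long)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/*
 * Free the whole chain and return the number of nodes freed.
 * Whenever more than YIELD_BUDGET_MS has elapsed since the last
 * yield, give up the CPU and restart the budget. If yields is not
 * NULL, it receives the number of times the CPU was yielded.
 */
static size_t mailbox_chain_free(struct mailbox *head, size_t *yields)
{
	long long start = now_ms();
	size_t freed = 0;

	if (yields)
		*yields = 0;

	while (head) {
		struct mailbox *next = head->next;

		free(head);
		freed++;
		head = next;

		if (now_ms() - start > YIELD_BUDGET_MS) {
			sched_yield();
			if (yields)
				(*yields)++;
			start = now_ms();
		}
	}
	return freed;
}

/* Build a chain of n nodes; returns NULL if n is 0 or malloc fails. */
static struct mailbox *mailbox_chain_alloc(size_t n)
{
	struct mailbox *head = NULL;

	while (n--) {
		struct mailbox *m = malloc(sizeof(*m));

		if (!m) {
			mailbox_chain_free(head, NULL);
			return NULL;
		}
		m->next = head;
		head = m;
	}
	return head;
}
```

Reading the clock on every iteration is cheap for jiffies but comparatively costly in userspace, so a real userspace version would probably check the time only every few hundred nodes. The sketch keeps the per-iteration check to stay close to the shape of the patch.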