[PATCH net 2/6] net/mlx5: Fix misidentification of write combining CQE during poll loop

From: Tariq Toukan

Date: Thu Feb 12 2026 - 05:33:43 EST


From: Gal Pressman <gal@xxxxxxxxxx>

The write combining completion poll loop uses usleep_range() which can
sleep much longer than requested due to scheduler latency. Under load,
we witnessed a 20ms+ delay until the process was rescheduled, causing
the jiffies based timeout to expire while the thread is sleeping.

The original do-while loop structure (poll, sleep, check timeout) would
exit without a final poll when waking after timeout, missing a CQE that
arrived during sleep.

Restructure the loop by moving the poll into the while condition,
ensuring we always poll after sleeping, catching CQEs that arrived
during that time.
While at it, remove the redundant 'err' assignment.

Fixes: d98995b4bf98 ("net/mlx5: Reimplement write combining test")
Signed-off-by: Gal Pressman <gal@xxxxxxxxxx>
Reviewed-by: Jianbo Liu <jianbol@xxxxxxxxxx>
Signed-off-by: Tariq Toukan <tariqt@xxxxxxxxxx>
---
drivers/net/ethernet/mellanox/mlx5/core/wc.c | 10 ++++------
1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/wc.c b/drivers/net/ethernet/mellanox/mlx5/core/wc.c
index 815a7c97d6b0..29db15c4b978 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/wc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/wc.c
@@ -390,12 +390,10 @@ static void mlx5_core_test_wc(struct mlx5_core_dev *mdev)
mlx5_wc_post_nop(sq, &offset, true);

expires = jiffies + TEST_WC_POLLING_MAX_TIME_JIFFIES;
- do {
- err = mlx5_wc_poll_cq(sq);
- if (err)
- usleep_range(2, 10);
- } while (mdev->wc_state == MLX5_WC_STATE_UNINITIALIZED &&
- time_is_after_jiffies(expires));
+ while ((mlx5_wc_poll_cq(sq),
+ mdev->wc_state == MLX5_WC_STATE_UNINITIALIZED) &&
+ time_is_after_jiffies(expires))
+ usleep_range(2, 10);

mlx5_wc_destroy_sq(sq);

--
2.44.0