Re: [PATCH 2/2] net/mlx5: Only consider online CPUs in affinity subset check

From: Shay Drori

Date: Thu Jun 04 2026 - 01:54:22 EST




On 03/06/2026 10:26, Fushuai Wang wrote:

From: Fushuai Wang <wangfushuai@xxxxxxxxx>

When an SF is created after a CPU has been taken offline, the IRQ pool may
contain IRQs whose affinity masks include the offline CPU. Since only
online CPUs should be considered for IRQ placement, the cpumask_subset() check
fails because iter_mask contains offline CPUs that are not present
in req_mask, causing SF creation to fail.

Thanks for the patch!

Can you please provide a full example? For simplicity, let's say the SF
pool has a size of 2 IRQs.
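
For illustration only, here is one hypothetical scenario that would match the
description, modeled in userspace C: 4 CPUs, a pool of 2 IRQs whose affinity
masks were set while all CPUs were online, and CPU 3 taken offline afterwards.
The toy_* names and all mask values are assumptions made for this sketch, not
mlx5 code; toy_cpumask_subset() only mirrors the semantics of cpumask_subset().

```c
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-in for struct cpumask: bit n set means CPU n is in the mask. */
typedef uint64_t toy_cpumask;

/* Same semantics as cpumask_subset(a, b): true if every CPU in a is also in b. */
static bool toy_cpumask_subset(toy_cpumask a, toy_cpumask b)
{
	return (a & ~b) == 0;
}

/*
 * Count how many pool IRQs pass the subset check against req_mask.
 * With filter_online == false this models the check before the patch,
 * with filter_online == true it models the check after the patch.
 */
static int toy_count_matches(const toy_cpumask *irq_masks, int n,
			     toy_cpumask req_mask, toy_cpumask online,
			     bool filter_online)
{
	int i, matches = 0;

	for (i = 0; i < n; i++) {
		toy_cpumask m = irq_masks[i];

		if (filter_online)
			m &= online;	/* drop offline CPUs first */
		if (toy_cpumask_subset(m, req_mask))
			matches++;
	}
	return matches;
}
```

With both pool masks equal to 0xf (CPUs 0-3) and req_mask = online = 0x7
(CPUs 0-2), the unfiltered check matches no IRQ, while the filtered check
matches both.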


Filter the affinity mask to only include online CPUs before checking if it's
a subset of the requested mask,

Won't this cause the affinity mask to be empty, which kind of misses
the point of this API... :(
Can you check whether irq_get_effective_affinity_mask() solves the issue?
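
To make the empty-mask concern concrete, here is a minimal userspace sketch.
It assumes an IRQ whose configured affinity contains only the offline CPU; the
sim_* names and mask values are hypothetical, and sim_guarded_check() is just
one possible stricter variant, not something proposed in this thread. An empty
mask is a subset of every mask, so the filtered check passes vacuously.

```c
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-in for struct cpumask: bit n set means CPU n is in the mask. */
typedef uint64_t sim_cpumask;

/* Same semantics as cpumask_subset(a, b). */
static bool sim_cpumask_subset(sim_cpumask a, sim_cpumask b)
{
	return (a & ~b) == 0;
}

/* Models the check as the patch performs it: filter by online CPUs, then test. */
static bool sim_patched_check(sim_cpumask iter_mask, sim_cpumask req_mask,
			      sim_cpumask online)
{
	sim_cpumask tmp = iter_mask & online;

	/* tmp == 0 is a subset of any req_mask, so this returns true */
	return sim_cpumask_subset(tmp, req_mask);
}

/* Hypothetical variant that rejects IRQs left with no online CPU. */
static bool sim_guarded_check(sim_cpumask iter_mask, sim_cpumask req_mask,
			      sim_cpumask online)
{
	sim_cpumask tmp = iter_mask & online;

	return tmp != 0 && sim_cpumask_subset(tmp, req_mask);
}
```

With online = 0x7 (CPUs 0-2), iter_mask = 0x8 (CPU 3 only) and req_mask = 0x1,
sim_patched_check() accepts the IRQ even though none of its configured CPUs is
online, while sim_guarded_check() rejects it.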

Thanks

ensuring SF creation succeeds in this scenario.

Fixes: 061f5b23588a ("net/mlx5: SF, Use all available cpu for setting cpu affinity")
Signed-off-by: Fushuai Wang <wangfushuai@xxxxxxxxx>
---
.../net/ethernet/mellanox/mlx5/core/irq_affinity.c | 14 ++++++++++++--
1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/irq_affinity.c b/drivers/net/ethernet/mellanox/mlx5/core/irq_affinity.c
index 994fe83da4be..8c0df240b888 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/irq_affinity.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/irq_affinity.c
@@ -102,18 +102,26 @@ irq_pool_find_least_loaded(struct mlx5_irq_pool *pool, const struct cpumask *req
 	struct mlx5_irq *iter;
 	int irq_refcount = 0;
 	unsigned long index;
+	cpumask_var_t tmp;

 	lockdep_assert_held(&pool->lock);
+
+	if (!alloc_cpumask_var(&tmp, GFP_ATOMIC))
+		return NULL;
+
 	xa_for_each_range(&pool->irqs, index, iter, start, end) {
 		struct cpumask *iter_mask = mlx5_irq_get_affinity_mask(iter);
 		int iter_refcount = mlx5_irq_read_locked(iter);

-		if (!cpumask_subset(iter_mask, req_mask))
+		cpumask_and(tmp, iter_mask, cpu_online_mask);
+		if (!cpumask_subset(tmp, req_mask))
 			/* skip IRQs with a mask which is not subset of req_mask */
 			continue;
-		if (iter_refcount < pool->min_threshold)
+		if (iter_refcount < pool->min_threshold) {
 			/* If we found an IRQ with less than min_thres, return it */
+			free_cpumask_var(tmp);
 			return iter;
+		}
 		if (!irq || iter_refcount < irq_refcount) {
 			/* In case we won't find an IRQ with less than min_thres,
 			 * keep a pointer to the least used IRQ
@@ -122,6 +130,8 @@ irq_pool_find_least_loaded(struct mlx5_irq_pool *pool, const struct cpumask *req
 			irq = iter;
 		}
 	}
+
+	free_cpumask_var(tmp);
 	return irq;
 }

--
2.36.1