Re: [PATCH] sched/topology: optimize sched_numa_find_nth_cpu()
From: Shrikanth Hegde
Date: Fri Mar 27 2026 - 09:41:31 EST
Hi Valentin,
On 3/27/26 1:15 AM, Valentin Schneider wrote:
On 26/03/26 20:09, Valentin Schneider wrote:
On 19/03/26 13:26, Yury Norov wrote:
The binary search callback hop_cmp() uses cpumask_weight_and() on each
iteration. Switch it to cpumask_nth_and() as it returns earlier, as
soon as the required number of CPUs is found.
Signed-off-by: Yury Norov <ynorov@xxxxxxxxxx>
Woopsie, forget about the empty reply.
Took me a little while to get back on track with how
sched_numa_find_nth_cpu() works. Doesn't help that it and the
cpumask_nth*() family have a @cpu parameter when it isn't a CPU but an
offset from a search start (i.e. an index, as coined for cpumask_local_spread()).
Comments would help,
Yes. It is quite difficult to get at first glance.
I'll try to come up with something tomorrow if I wake
up from whatever's brewing in my lungs :(
I figured I'd give it a try before the fever kicks in:
Take care.
---
diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index 32dcddaead82d..7069179d5ee0c 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -2267,6 +2267,7 @@ static int hop_cmp(const void *a, const void *b)
 	struct cpumask **prev_hop, **cur_hop = *(struct cpumask ***)b;
 	struct __cmp_key *k = (struct __cmp_key *)a;
 
+	/* Not enough CPUs reachable in that many hops */
 	if (cpumask_weight_and(k->cpus, cur_hop[k->node]) <= k->cpu)
 		return 1;
@@ -2275,6 +2276,10 @@ static int hop_cmp(const void *a, const void *b)
 		return 0;
 	}
 
+	/*
+	 * cur_hop spans enough CPUs to return an nth one; if the immediately
+	 * preceding hop doesn't, then we're done.
+	 */
 	prev_hop = *((struct cpumask ***)b - 1);
 	k->w = cpumask_weight_and(k->cpus, prev_hop[k->node]);
 	if (k->w <= k->cpu)
@@ -2284,16 +2289,16 @@ static int hop_cmp(const void *a, const void *b)
 }
 
 /**
- * sched_numa_find_nth_cpu() - given the NUMA topology, find the Nth closest CPU
- *			       from @cpus to @cpu, taking into account distance
- *			       from a given @node.
+ * sched_numa_find_nth_cpu() - given the NUMA topology, find the @nth_cpu in
+ *			       @cpus reachable from @node in the fewest hops.
  * @cpus: cpumask to find a cpu from
- * @cpu: CPU to start searching
- * @node: NUMA node to order CPUs by distance
+ * @nth_cpu: CPU offset to search for
+ * @node: NUMA node to start the search from
  *
  * Return: cpu, or nr_cpu_ids when nothing found.
  */
-int sched_numa_find_nth_cpu(const struct cpumask *cpus, int cpu, int node)
+int sched_numa_find_nth_cpu(const struct cpumask *cpus, int nth_cpu, int node)
 {
 	struct __cmp_key k = { .cpus = cpus, .cpu = cpu };
nit: should this be .cpu = nth_cpu? The cpumask_nth_*() calls below also still use cpu and need the same rename.
 	struct cpumask ***hop_masks;
@@ -2315,8 +2320,21 @@ int sched_numa_find_nth_cpu(const struct cpumask *cpus, int cpu, int node)
 	hop_masks = bsearch(&k, k.masks, sched_domains_numa_levels, sizeof(k.masks[0]), hop_cmp);
 	if (!hop_masks)
 		goto unlock;
+	/*
+	 * bsearch returned sched_domains_numa_masks[hop], with @hop being the
+	 * smallest number of hops it takes to reach an @nth_cpu from @node.
+	 */
 	hop = hop_masks - k.masks;
+	/*
+	 * @hop is constructed by hop_cmp() such that sched_domains_numa_masks[hop][node]
+	 * spans enough CPUs to return an @nth_cpu, and sched_domains_numa_masks[hop-1][node]
+	 * doesn't.
+	 *
+	 * Get a cpumask without the CPUs from sched_domains_numa_masks[hop-1][node],
+	 * subtract how many CPUs that mask contains (@k.w), and fetch our @nth_cpu
+	 * from the resulting mask.
+	 */
 	ret = hop ?
 		cpumask_nth_and_andnot(cpu - k.w, cpus, k.masks[hop][node], k.masks[hop-1][node]) :
 		cpumask_nth_and(cpu, cpus, k.masks[0][node]);