[PATCH AUTOSEL 5.10 35/36] Revert "lib: Restrict cpumask_local_spread to houskeeping CPUs"

From: Sasha Levin
Date: Mon Feb 08 2021 - 14:30:09 EST


From: Thomas Gleixner <tglx@xxxxxxxxxxxxx>

[ Upstream commit 2452483d9546de1c540f330469dc4042ff089731 ]

This reverts commit 1abdfe706a579a702799fce465bceb9fb01d407c.

This change is broken and not solving any problem it claims to solve.

Robin reported that cpumask_local_spread() now returns any cpu out of
cpu_possible_mask in case that NOHZ_FULL is disabled (runtime or compile
time). It can also return any offline or not-present CPU in the
housekeeping mask. Before that it was returning a CPU out of
online_cpu_mask.

While the function is racy against CPU hotplug if the caller does not
protect against it, the actual use cases are not caring much about it as
they use it mostly as hint for:

- the user space affinity hint which is unused by the kernel
- memory node selection which is just suboptimal
- network queue affinity which might fail but is handled gracefully

But the occasional fail vs. hotplug is very different from returning
anything from possible_cpu_mask which can have a large amount of offline
CPUs obviously.

The changelog of the commit claims:

"The current implementation of cpumask_local_spread() does not respect
the isolated CPUs, i.e., even if a CPU has been isolated for Real-Time
task, it will return it to the caller for pinning of its IRQ
threads. Having these unwanted IRQ threads on an isolated CPU adds up
to a latency overhead."

The only correct part of this changelog is:

"The current implementation of cpumask_local_spread() does not respect
the isolated CPUs."

Everything else is just disjunct from reality.

Reported-by: Robin Murphy <robin.murphy@xxxxxxx>
Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Cc: Nitesh Narayan Lal <nitesh@xxxxxxxxxx>
Cc: Marcelo Tosatti <mtosatti@xxxxxxxxxx>
Cc: abelits@xxxxxxxxxxx
Cc: davem@xxxxxxxxxxxxx
Link: https://lore.kernel.org/r/87y2g26tnt.fsf@xxxxxxxxxxxxxxxxxxxxxxx
Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>
---
lib/cpumask.c | 16 +++++-----------
1 file changed, 5 insertions(+), 11 deletions(-)

diff --git a/lib/cpumask.c b/lib/cpumask.c
index 85da6ab4fbb5a..fb22fb266f937 100644
--- a/lib/cpumask.c
+++ b/lib/cpumask.c
@@ -6,7 +6,6 @@
#include <linux/export.h>
#include <linux/memblock.h>
#include <linux/numa.h>
-#include <linux/sched/isolation.h>

/**
* cpumask_next - get the next cpu in a cpumask
@@ -206,27 +205,22 @@ void __init free_bootmem_cpumask_var(cpumask_var_t mask)
*/
unsigned int cpumask_local_spread(unsigned int i, int node)
{
- int cpu, hk_flags;
- const struct cpumask *mask;
+ int cpu;

- hk_flags = HK_FLAG_DOMAIN | HK_FLAG_MANAGED_IRQ;
- mask = housekeeping_cpumask(hk_flags);
/* Wrap: we always want a cpu. */
- i %= cpumask_weight(mask);
+ i %= num_online_cpus();

if (node == NUMA_NO_NODE) {
- for_each_cpu(cpu, mask) {
+ for_each_cpu(cpu, cpu_online_mask)
if (i-- == 0)
return cpu;
- }
} else {
/* NUMA first. */
- for_each_cpu_and(cpu, cpumask_of_node(node), mask) {
+ for_each_cpu_and(cpu, cpumask_of_node(node), cpu_online_mask)
if (i-- == 0)
return cpu;
- }

- for_each_cpu(cpu, mask) {
+ for_each_cpu(cpu, cpu_online_mask) {
/* Skip NUMA nodes, done above. */
if (cpumask_test_cpu(cpu, cpumask_of_node(node)))
continue;
--
2.27.0