Re: [PATCH v4 0/7] sched, net: NUMA-aware CPU spreading interface

From: Yury Norov
Date: Fri Sep 23 2022 - 11:45:19 EST


On Fri, Sep 23, 2022 at 02:25:20PM +0100, Valentin Schneider wrote:
> Hi folks,

Hi,

I received only the 1st patch of the series. Can you give me a link to
the full series so that I can see how the new API is used?

Thanks,
Yury

> Tariq pointed out in [1] that drivers allocating IRQ vectors would benefit
> from having smarter NUMA-awareness (cpumask_local_spread() doesn't quite cut
> it).
>
> The proposed interface involved an array of CPUs and a temporary cpumask.
> Being my difficult self, what I'm proposing here is an interface that doesn't
> require any temporary storage other than some stack variables (at the cost of
> one wild macro).
>
> Please note that this is based on top of Yury's bitmap-for-next [2] to leverage
> his fancy new FIND_NEXT_BIT() macro.
>
> [1]: https://lore.kernel.org/all/20220728191203.4055-1-tariqt@xxxxxxxxxx/
> [2]: https://github.com/norov/linux.git/ -b bitmap-for-next
>
> A note on treewide use of for_each_cpu_andnot()
> ===============================================
>
> I've used the below coccinelle script to find places that could be patched (I
> couldn't figure out the valid syntax to patch from coccinelle itself):
>
> ,-----
> @tmpandnot@
> expression tmpmask;
> iterator for_each_cpu;
> position p;
> statement S;
> @@
> cpumask_andnot(tmpmask, ...);
>
> ...
>
> (
> for_each_cpu@p(..., tmpmask, ...)
> S
> |
> for_each_cpu@p(..., tmpmask, ...)
> {
> ...
> }
> )
>
> @script:python depends on tmpandnot@
> p << tmpandnot.p;
> @@
> coccilib.report.print_report(p[0], "andnot loop here")
> '-----
>
> Which yields (against c40e8341e3b3):
>
> .//arch/powerpc/kernel/smp.c:1587:1-13: andnot loop here
> .//arch/powerpc/kernel/smp.c:1530:1-13: andnot loop here
> .//arch/powerpc/kernel/smp.c:1440:1-13: andnot loop here
> .//arch/powerpc/platforms/powernv/subcore.c:306:2-14: andnot loop here
> .//arch/x86/kernel/apic/x2apic_cluster.c:62:1-13: andnot loop here
> .//drivers/acpi/acpi_pad.c:110:1-13: andnot loop here
> .//drivers/cpufreq/armada-8k-cpufreq.c:148:1-13: andnot loop here
> .//drivers/cpufreq/powernv-cpufreq.c:931:1-13: andnot loop here
> .//drivers/net/ethernet/sfc/efx_channels.c:73:1-13: andnot loop here
> .//drivers/net/ethernet/sfc/siena/efx_channels.c:73:1-13: andnot loop here
> .//kernel/sched/core.c:345:1-13: andnot loop here
> .//kernel/sched/core.c:366:1-13: andnot loop here
> .//net/core/dev.c:3058:1-13: andnot loop here
>
> A lot of those are actually of the shape
>
>   for_each_cpu(cpu, mask) {
>       ...
>       cpumask_andnot(mask, ...);
>   }
>
> I think *some* of the powerpc ones would be a match for for_each_cpu_andnot(),
> but I decided to just stick to the one obvious one in __sched_core_flip().
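
For readers unfamiliar with the pattern the script looks for, here is a
minimal userspace sketch of the idea behind for_each_cpu_andnot(): iterate
over the bits set in one mask and clear in another without first writing the
result into a temporary mask. It is not taken from the patches. The toy_*
names, the 64-bit mask type, and the helper signatures are all hypothetical
stand-ins for the kernel's cpumask API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for struct cpumask: bit n set means CPU n is in the mask. */
typedef uint64_t toy_mask_t;
#define TOY_NR_CPUS 64u

/* Lowest bit index >= start that is set in (a & ~b), or TOY_NR_CPUS if none. */
unsigned int toy_next_andnot(toy_mask_t a, toy_mask_t b, unsigned int start)
{
    toy_mask_t m = a & ~b;
    unsigned int i;

    for (i = start; i < TOY_NR_CPUS; i++)
        if (m & ((toy_mask_t)1 << i))
            return i;
    return TOY_NR_CPUS;
}

/* Toy analogue of for_each_cpu_andnot(): visit bits set in a but clear in b. */
#define toy_for_each_andnot(cpu, a, b)                  \
    for ((cpu) = toy_next_andnot((a), (b), 0);          \
         (cpu) < TOY_NR_CPUS;                           \
         (cpu) = toy_next_andnot((a), (b), (cpu) + 1))

/* Old shape: build a temporary mask, then walk it. Returns the sum of CPU ids. */
unsigned int toy_sum_with_tmp(toy_mask_t a, toy_mask_t b)
{
    toy_mask_t tmp = a & ~b;    /* plays the role of cpumask_andnot(tmp, a, b) */
    unsigned int cpu, sum = 0;

    for (cpu = 0; cpu < TOY_NR_CPUS; cpu++)
        if (tmp & ((toy_mask_t)1 << cpu))
            sum += cpu;
    return sum;
}

/* New shape: no temporary mask, the and-not is folded into the iteration. */
unsigned int toy_sum_andnot(toy_mask_t a, toy_mask_t b)
{
    unsigned int cpu, sum = 0;

    toy_for_each_andnot(cpu, a, b)
        sum += cpu;
    return sum;
}
```

In this toy the temporary is a single integer, so nothing is saved. With a
real cpumask the temporary can be a large bitmap that has to live on the
stack or be allocated, which is the cost the merged loop avoids.
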
>
> Revisions
> =========
>
> v3 -> v4
> ++++++++
>
> o Rebased on top of Yury's bitmap-for-next
> o Added Tariq's mlx5e patch
> o Made sched_numa_hop_mask() return cpu_online_mask for the NUMA_NO_NODE &&
>   hops=0 case
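
As a rough illustration of what a per-hop mask lookup with this special case
could look like, and how it might drive a nearest-first CPU ordering, here is
a userspace toy. It assumes that the mask for a given hop count is a superset
of the mask for the previous one, which is an assumption of this sketch and
not something stated above. The two-node topology, the toy_* names, and the
return value of 0 for invalid requests are all hypothetical. The real
sched_numa_hop_mask() signature and error handling are not shown in this
message.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t toy_mask_t;     /* bit n set means CPU n is in the mask */
#define TOY_NO_NODE   (-1)       /* stand-in for NUMA_NO_NODE */
#define TOY_NR_NODES  2
#define TOY_NR_HOPS   2

/* Hypothetical topology: node 0 owns CPUs 0-3, node 1 owns CPUs 4-7. */
static const toy_mask_t toy_online = 0xFF;
static const toy_mask_t toy_hop_masks[TOY_NR_NODES][TOY_NR_HOPS] = {
    { 0x0F, 0xFF },   /* node 0: local CPUs, then everything one hop away */
    { 0xF0, 0xFF },   /* node 1: local CPUs, then everything one hop away */
};

/* CPUs within `hops` hops of `node`. In this toy, 0 marks an invalid request. */
toy_mask_t toy_numa_hop_mask(int node, int hops)
{
    if (node == TOY_NO_NODE && hops == 0)
        return toy_online;       /* the special case from the changelog */
    if (node < 0 || node >= TOY_NR_NODES || hops < 0 || hops >= TOY_NR_HOPS)
        return 0;
    return toy_hop_masks[node][hops];
}

/* Fill order[] with CPUs nearest-first, up to max entries; returns the count. */
unsigned int toy_spread_order(int node, unsigned int *order, unsigned int max)
{
    toy_mask_t prev = 0;
    unsigned int n = 0;
    unsigned int cpu;
    int hops;

    for (hops = 0; hops < TOY_NR_HOPS; hops++) {
        toy_mask_t cur = toy_numa_hop_mask(node, hops);
        toy_mask_t fresh = cur & ~prev;   /* CPUs first reachable at this hop */

        for (cpu = 0; cpu < 64 && n < max; cpu++)
            if (fresh & ((toy_mask_t)1 << cpu))
                order[n++] = cpu;
        prev = cur;
    }
    return n;
}
```

Calling toy_spread_order(1, order, 8) on this topology visits CPUs 4-7 first
and CPUs 0-3 afterwards. The "current hop and-not previous hop" step is
where an and-not iterator fits naturally.
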
>
> v2 -> v3
> ++++++++
>
> o Added for_each_cpu_and() and for_each_cpu_andnot() tests (Yury)
> o New patches to fix issues raised by running the above
> o New patch to use for_each_cpu_andnot() in sched/core.c (Yury)
>
> v1 -> v2
> ++++++++
>
> o Split _find_next_bit() @invert into @invert1 and @invert2 (Yury)
> o Rebase onto v6.0-rc1
>
> Cheers,
> Valentin
>
> Tariq Toukan (1):
>   net/mlx5e: Improve remote NUMA preferences used for the IRQ affinity
>     hints
>
> Valentin Schneider (6):
>   lib/find_bit: Introduce find_next_andnot_bit()
>   cpumask: Introduce for_each_cpu_andnot()
>   lib/test_cpumask: Add for_each_cpu_and(not) tests
>   sched/core: Merge cpumask_andnot()+for_each_cpu() into
>     for_each_cpu_andnot()
>   sched/topology: Introduce sched_numa_hop_mask()
>   sched/topology: Introduce for_each_numa_hop_cpu()
>
>  drivers/net/ethernet/mellanox/mlx5/core/eq.c | 13 +++++-
>  include/linux/cpumask.h                      | 39 ++++++++++++++++
>  include/linux/find.h                         | 33 +++++++++++++
>  include/linux/topology.h                     | 49 ++++++++++++++++++++
>  kernel/sched/core.c                          |  5 +-
>  kernel/sched/topology.c                      | 31 +++++++++++++
>  lib/cpumask_kunit.c                          | 19 ++++++++
>  lib/find_bit.c                               |  9 ++++
>  8 files changed, 192 insertions(+), 6 deletions(-)
>
> --
> 2.31.1