On 14/12/2019 13:56, Marc Zyngier wrote:
> On Fri, 13 Dec 2019 15:43:07 +0000
> John Garry <john.garry@xxxxxxxxxx> wrote:

[...]

> The NUMA selection code definitely gets in the way. And to be honest,
> this NUMA thing is only there for the benefit of a terminally broken
> implementation (Cavium ThunderX), which we should have never supported
> in the first place.
>
> Let's rework this and simply use the managed affinity whenever
> available instead. It may well be that it will break TX1, but I care
> about it just as much as Cavium/Marvell does...
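
Just to check that I'm reading the idea correctly, I'd imagine the
target selection ends up roughly like the sketch below. This is not
lifted from your branch; the helper name, the lpi_count[] per-CPU
counter and the node_mask fallback argument are all invented here for
illustration.

#include <linux/atomic.h>
#include <linux/cpumask.h>
#include <linux/irq.h>

/*
 * Rough sketch only: prefer the managed affinity mask when the
 * interrupt has one, and pick the least-loaded online CPU in it;
 * otherwise fall back to whatever node-based mask is used today.
 * lpi_count[] is a made-up "LPIs currently targeting this CPU"
 * counter, indexed by CPU number.
 */
static unsigned int pick_lpi_target(struct irq_data *d,
                                    const struct cpumask *node_mask,
                                    atomic_t *lpi_count)
{
        const struct cpumask *search = node_mask;
        unsigned int cpu, best = nr_cpu_ids;

        if (irqd_affinity_is_managed(d))
                search = irq_data_get_affinity_mask(d);

        for_each_cpu_and(cpu, search, cpu_online_mask) {
                if (best >= nr_cpu_ids ||
                    atomic_read(&lpi_count[cpu]) <
                    atomic_read(&lpi_count[best]))
                        best = cpu;
        }

        return best;    /* nr_cpu_ids if the mask turned out empty */
}
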
I'm just wondering whether non-managed interrupts should be included in
the load-balancing calculation at all. Couldn't irqbalance (if active)
start moving non-managed interrupts around anyway?
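
Put another way, I would only have expected managed interrupts to feed
that per-CPU count, along the lines of the (equally made-up) helper
below, reusing the invented lpi_count[] from the sketch above; counts
contributed by interrupts that irqbalance can retarget later would go
stale anyway.

/*
 * Hypothetical accounting helper: only managed interrupts bump the
 * per-CPU count. Non-managed ones are skipped, since irqbalance may
 * move them at any time and invalidate whatever we counted.
 */
static void lpi_account(struct irq_data *d, unsigned int cpu,
                        atomic_t *lpi_count, bool add)
{
        if (!irqd_affinity_is_managed(d))
                return;

        if (add)
                atomic_inc(&lpi_count[cpu]);
        else
                atomic_dec(&lpi_count[cpu]);
}
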
> Please give this new patch a shot on your system (my D05 doesn't have
> any managed devices):

We could consider supporting managed interrupts for platform MSI
devices too, but I doubt the value.

> https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git/commit/?h=irq/its-balance-mappings&id=1e987d83b8d880d56c9a2d8a86289631da94e55a

I quickly tested that in my NVMe env, and I see a performance boost
of 1055K -> 1206K IOPS. Results at bottom.

Here's the irq mapping dump:

john@ubuntu:~$ ./dump-io-irq-affinity
kernel version:
Linux ubuntu 5.5.0-rc1-00003-g7adc5d7ec1ca-dirty #1440 SMP PREEMPT Fri Dec 13 14:53:19 GMT 2019 aarch64 aarch64 aarch64 GNU/Linux
PCI name is 04:00.0: nvme0n1
irq 56, cpu list 75, effective list 5
irq 60, cpu list 24-28, effective list 10
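
In case it helps map the columns: "cpu list" and "effective list" here
correspond to what /proc/irq/<N>/smp_affinity_list and
/proc/irq/<N>/effective_affinity_list report. The script itself isn't
included in this mail; the small stand-alone reader below is only an
illustrative equivalent for a given set of IRQ numbers.

/*
 * Minimal stand-in for the affinity part of the dump: print the
 * requested and effective affinity of each IRQ number given on the
 * command line, straight from procfs.
 */
#include <stdio.h>

static void print_list(unsigned int irq, const char *label,
                       const char *file)
{
        char path[64], buf[256];
        FILE *f;

        snprintf(path, sizeof(path), "/proc/irq/%u/%s", irq, file);
        f = fopen(path, "r");
        if (f && fgets(buf, sizeof(buf), f))
                printf("irq %u, %s %s", irq, label, buf); /* buf keeps '\n' */
        else
                printf("irq %u, %s <unavailable>\n", irq, label);
        if (f)
                fclose(f);
}

int main(int argc, char **argv)
{
        for (int i = 1; i < argc; i++) {
                unsigned int irq;

                if (sscanf(argv[i], "%u", &irq) != 1)
                        continue;
                print_list(irq, "cpu list", "smp_affinity_list");
                print_list(irq, "effective list", "effective_affinity_list");
        }
        return 0;
}
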
I'm still getting the CPU lockup (even on CPUs which have a single
NVMe completion interrupt assigned), which taints these results. That
lockup needs to be fixed.

We'll check our SAS env also. I had already hacked up something similar
to your change, and again we saw a boost there.