Re: [PATCH 02/13] irq: Introduce IRQD_AFFINITY_MANAGED flag

From: Bart Van Assche
Date: Wed Jun 15 2016 - 15:37:19 EST


On 06/15/2016 06:03 PM, Keith Busch wrote:
> On Wed, Jun 15, 2016 at 05:28:54PM +0200, Bart Van Assche wrote:
>> On 06/15/2016 05:14 PM, Keith Busch wrote:
>>> I think the idea is to have the irq_affinity mask match the CPU mapping on
>>> the submission side context associated with that particular vector. If
>>> two identical adapters generate the same submission CPU mapping, I don't
>>> think we can do better than matching irq_affinity masks.
>>
>> Has this been verified by measurements? Sorry but I'm not convinced that
>> using the same mapping for multiple identical adapters instead of spreading
>> interrupts will result in better performance.
>
> The interrupts automatically spread based on which CPU submitted the
> work. If you want to spread interrupts across more CPUs, then you can
> spread submissions to the CPUs you want to service the interrupts.
>
> Completing work on the same CPU that submitted it is quickest with
> its cache hot access. I have equipment available to demo this. What
> affinity_mask policy would you like to see compared with the proposal?

Hello Keith,

Sorry that I had not yet made this clear, but my concern is about a system equipped with two or more adapters and with more CPU cores than the number of MSI-X interrupts per adapter. Consider e.g. a system with two adapters (A and B), 8 interrupts per adapter (A0..A7 and B0..B7), 32 CPU cores and two NUMA nodes. Assuming that hyperthreading is disabled, will the patches from this patch series generate the following interrupt assignment?

0: A0 B0
1: A1 B1
2: A2 B2
3: A3 B3
4: A4 B4
5: A5 B5
6: A6 B6
7: A7 B7
8: (none)
...
31: (none)

The mapping I would like to see is as follows (assuming CPU cores 0..15 correspond to NUMA node 0 and CPU cores 16..31 correspond to NUMA node 1):

0: A0
1: B0
2: (none)
3: (none)
4: A1
5: B1
6: (none)
7: (none)
8: A2
9: B2
10: (none)
11: (none)
12: A3
13: B3
14: (none)
15: (none)
...
31: (none)

Do you agree that - ignoring other interrupt assignments - the latter interrupt assignment scheme would result in higher throughput and lower interrupt processing latency?
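
To make the two mappings above concrete, here is a small Python sketch that reproduces both tables. It is only an illustration: the function names and the "vector i of adapter k goes to CPU i * stride + k" rule are my own assumptions, inferred from the tables, and are not code from the patch series or from the kernel.

```python
# Illustrative sketch only. The function names and the stride rule are
# assumptions derived from the two tables in this mail, not kernel code.
NUM_CPUS = 32
VECTORS_PER_ADAPTER = 8
ADAPTERS = ["A", "B"]


def packed_assignment(num_cpus, vectors, adapters):
    """First table: vector i of every adapter lands on CPU i."""
    cpus = {cpu: [] for cpu in range(num_cpus)}
    for name in adapters:
        for i in range(vectors):
            cpus[i % num_cpus].append(f"{name}{i}")
    return cpus


def spread_assignment(num_cpus, vectors, adapters):
    """Second table: vector i of adapter k lands on CPU i * stride + k."""
    stride = num_cpus // vectors  # CPUs available per vector index
    if len(adapters) > stride:
        raise ValueError("not enough CPUs to give every vector its own CPU")
    cpus = {cpu: [] for cpu in range(num_cpus)}
    for k, name in enumerate(adapters):
        for i in range(vectors):
            cpus[i * stride + k].append(f"{name}{i}")
    return cpus


if __name__ == "__main__":
    packed = packed_assignment(NUM_CPUS, VECTORS_PER_ADAPTER, ADAPTERS)
    spread = spread_assignment(NUM_CPUS, VECTORS_PER_ADAPTER, ADAPTERS)
    for cpu in range(NUM_CPUS):
        # Print both mappings side by side, one CPU per line.
        print(cpu, packed[cpu] or "(none)", spread[cpu] or "(none)")
```

With 32 CPUs and 8 vectors per adapter the stride is 4, so in the second function each adapter's vectors are spaced four CPUs apart and no CPU services more than one interrupt, whereas the first function stacks two interrupts on each of CPUs 0..7.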

Thanks,

Bart.