Re: Interrupt Affinity in SMP

From: Bryan Hundven
Date: Sat Jul 17 2010 - 16:02:35 EST


On Sat, Jul 10, 2010 at 6:20 PM, Robert Hancock <hancockrwd@xxxxxxxxx> wrote:
> On Sat, Jul 10, 2010 at 1:46 PM, Bryan Hundven <bryanhundven@xxxxxxxxx> wrote:
>> I was able to set eth0 and its TxRx queues to cpu1, but it is my
>> understanding that 0xFFFFFFFF should distribute the interrupts across all
>> cpus, much like LOC in my output of /proc/interrupts.
>>
>> I don't have access to the computer this weekend, but I will provide more
>> info on Monday.
>
> That may be chipset dependent, I don't think all chipsets have the
> ability to distribute the interrupts like that. Round-robin interrupt
> distribution for a given handler isn't optimal for performance anyway
> since it causes the relevant cache lines for the interrupt handler to
> be ping-ponged between the different CPUs.
>
>>
>> -bryan
>>
>> On Jul 9, 2010 5:48 PM, "Robert Hancock" <hancockrwd@xxxxxxxxx> wrote:
>>
>> On 07/09/2010 04:59 PM, Bryan Hundven wrote:
>>>
>>> Mauro, list,
>>>
>>> (please CC me in replies, I am not...
>>
>> Tried changing these files to exclude CPU0?
>>
>> Have you tried running the irqbalance daemon? That's what you likely want to
>> be doing anyway..
>>
>>> =====8<=====8<=====8<=====8<=====8<=====8<=====8<=====8<=====8<=====
>>>
>>> =====8<=====8<=====8<==...
>

Please see the two attached examples.

Notice in the 5410 example how we start with the affinity set to 0xff
and change it to 0xf0.
This should spread the interrupts over the last 4 cores of this
dual-processor, quad-core system.
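
For reference, the masks map to CPU numbers bit by bit: bit n set means
CPUn is allowed. A minimal POSIX shell sketch of that mapping follows.
mask_to_cpus is a hypothetical helper name, not an existing tool, and it
assumes a single hex word without the comma-separated groups that larger
systems print.

```shell
#!/bin/sh
# mask_to_cpus: hypothetical helper, not part of any kernel tooling.
# Prints the CPU numbers selected by a hex smp_affinity mask.
# Assumption: the mask is one hex word (optionally prefixed with 0x)
# that fits in the shell's integer type.
# The body runs in a subshell so its variables stay local.
mask_to_cpus() (
    mask=$(( 0x${1#0x} ))
    cpu=0
    out=""
    while [ "$mask" -ne 0 ]; do
        # Test the lowest bit, then shift to the next CPU.
        if [ $(( mask & 1 )) -eq 1 ]; then
            out="${out:+$out }$cpu"
        fi
        mask=$(( mask >> 1 ))
        cpu=$(( cpu + 1 ))
    done
    echo "$out"
)

mask_to_cpus f0        # prints: 4 5 6 7
mask_to_cpus fff000    # prints: 12 13 14 15 16 17 18 19 20 21 22 23
```

So f0 selects CPU4-CPU7 on the 8-CPU box, and fff000 selects CPU12-CPU23
on the 24-CPU box.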

Also notice in the 5645 example that, with the same commands, we start
with 0xffffff and change to 0xfff000 to spread the interrupts over the
last 12 CPUs, but only the first of those twelve (CPU12) receives
interrupts.
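
The before/after comparison in the attached examples can also be done
mechanically. The sketch below is one possible way, not the method used
to produce the attachments: irq_delta is a hypothetical helper that takes
two samples of the same /proc/interrupts row and reports which CPUs saw
their count change. It assumes the row layout shown in the attachments
(IRQ number, per-CPU counts, then chip and device name).

```shell
#!/bin/sh
# irq_delta is a hypothetical helper: given two samples of the same
# /proc/interrupts row, print "CPUn delta" for each CPU whose count changed.
irq_delta() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        NR == 1 { for (i = 2; i <= NF; i++) before[i] = $i; next }
        {
            # Field 1 is the IRQ number; fields 2.. are per-CPU counts,
            # followed by non-numeric columns (chip and device name).
            for (i = 2; i <= NF; i++)
                if ($i ~ /^[0-9]+$/ && $i - before[i] != 0)
                    printf "CPU%d %d\n", i - 2, $i - before[i]
        }'
}

# Sample rows copied from the 5645 run after writing fff000.
before="86: 87627 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge eth0-TxRx-0"
after="86: 87627 0 0 0 0 0 0 0 0 0 0 0 9 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge eth0-TxRx-0"
irq_delta "$before" "$after"    # prints: CPU12 7

# On a live system the two samples would be taken like this:
#   before=$(grep '^ 86' /proc/interrupts); sleep 15
#   after=$(grep '^ 86' /proc/interrupts)
```

Fed the two 5410 rows taken after writing f0, the same helper reports
changes on CPU4 through CPU7.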

This is the inconsistency I was trying to explain before.

--Bryan
# uname -a
Linux hustle 2.6.32-3-amd64 #1 SMP Wed Feb 24 18:07:42 UTC 2010 x86_64 GNU/Linux

# cat /proc/cpuinfo
...
processor : 7
vendor_id : GenuineIntel
cpu family : 6
model : 23
model name : Intel(R) Xeon(R) CPU E5410 @ 2.33GHz
stepping : 6
cpu MHz : 2327.577
cache size : 6144 KB
physical id : 1
siblings : 4
core id : 3
cpu cores : 4
apicid : 7
initial apicid : 7
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 lahf_lm tpr_shadow vnmi flexpriority
bogomips : 4655.33
clflush size : 64
cache_alignment : 64
address sizes : 38 bits physical, 48 bits virtual
power management:

# Header for /proc/interrupts for reference...
# CPU0 CPU1 CPU2 CPU3 CPU4 CPU5 CPU6 CPU7

# grep '^ 64' /proc/interrupts && sleep 15 && grep '^ 64' /proc/interrupts
64: 8201417 8201529 8200276 8197421 8200250 8200304 8201232 8201092 PCI-MSI-edge eth0
64: 8201430 8201542 8200288 8197432 8200262 8200316 8201243 8201106 PCI-MSI-edge eth0

# echo "f0" > /proc/irq/64/smp_affinity
# grep '^ 64' /proc/interrupts && sleep 15 && grep '^ 64' /proc/interrupts
64: 8201476 8201595 8200332 8197474 8200311 8200364 8201292 8201159 PCI-MSI-edge eth0
64: 8201476 8201595 8200332 8197474 8200327 8200381 8201308 8201175 PCI-MSI-edge eth0

# Note that interrupts now occur only on the last four processors
# uname -a
Linux (none) 2.6.35-rc4 #1 SMP Fri Jul 9 16:15:15 PDT 2010 i686 unknown

# cat /proc/cpuinfo
...
processor : 23
vendor_id : GenuineIntel
cpu family : 6
model : 44
model name : Intel(R) Xeon(R) CPU E5645 @ 2.40GHz
stepping : 2
cpu MHz : 2399.870
cache size : 12288 KB
physical id : 1
siblings : 12
core id : 10
cpu cores : 6
apicid : 53
initial apicid : 53
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 11
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 sse4_2 popcnt aes lahf_lm arat tpr_shadow vnmi flexpriority ept vpid
bogomips : 4801.93
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management:

# cat /proc/irq/86/smp_affinity
ffffff

# Header for /proc/interrupts for reference...
# CPU0 CPU1 CPU2 CPU3 CPU4 CPU5 CPU6 CPU7 CPU8 CPU9 CPU10 CPU11 CPU12 CPU13 CPU14 CPU15 CPU16 CPU17 CPU18 CPU19 CPU20 CPU21 CPU22 CPU23

# grep '^ 86' /proc/interrupts && sleep 15 && grep '^ 86' /proc/interrupts
86: 87570 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge eth0-TxRx-0
86: 87577 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge eth0-TxRx-0

# echo "fff000" > /proc/irq/86/smp_affinity
# grep '^ 86' /proc/interrupts && sleep 15 && grep '^ 86' /proc/interrupts
86: 87627 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge eth0-TxRx-0
86: 87627 0 0 0 0 0 0 0 0 0 0 0 9 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge eth0-TxRx-0

# Notice the difference between the 5410 and the 5645: here only CPU12
# receives interrupts, not all of CPU12-CPU23