Mellanox interrupts are not load balanced

From: katiyar26@xxxxxxxxxxx
Date: Mon Jan 30 2023 - 00:44:22 EST


Hi,
I am running a CentOS 7.7 VM in Azure with the Mellanox (mlx5_core) driver for the NIC. It runs a customized 3.10.0-1062.18.1.el7 kernel image with some minor changes in the net directory.

The driver has created as many queues and IRQs as there are CPUs in the VM, but all of the interrupts are being handled by CPU0 only. The irqbalance service is running, and smp_affinity is set differently for different IRQs. I also tried setting the affinity manually after stopping the irqbalance service, but all of the interrupts were still delivered to CPU0, as the output below shows.

> cat /proc/interrupts
           CPU0       CPU1       CPU2       CPU3       CPU4       CPU5       CPU6       CPU7      
  0:       9881          0          0          0          0          0          0          0   IO-APIC-edge      timer
  1:          0          0          0          0          0          0          0          9   IO-APIC-edge      i8042
  3:         21         25         13         19          2          2          3        856   IO-APIC-edge   
  4:         68          6         25         22         21         10         19        360   IO-APIC-edge      serial
  8:          0          0          0          0          0          0          0          0   IO-APIC-edge      rtc0
  9:          0          0          0          0          0          0          0          0   IO-APIC-fasteoi   acpi
 12:          0          0          0          0          0          0          0          5   IO-APIC-edge      i8042
 14:        602        318        226        232        278        205         69       8917   IO-APIC-edge      ata_piix
 15:          0          0          0          0          0          0          0          0   IO-APIC-edge      ata_piix
 24:          0          0          0          0          0          0          0          0   PCI-MSI-edge      mlx5_pages_eq@pci:8b76:00:02.0
 25:      19694          0          0          0          0          0          0          0   PCI-MSI-edge      mlx5_cmd_eq@pci:8b76:00:02.0
 26:          0          0          0          0          0          0          0          0   PCI-MSI-edge      mlx5_async_eq@pci:8b76:00:02.0
 28:     123648          0          0          0          0          0          0          0   PCI-MSI-edge      mlx5_comp0@pci:8b76:00:02.0
 29:     152455          0          0          0          0          0          0          0   PCI-MSI-edge      mlx5_comp1@pci:8b76:00:02.0
 30:     102308          0          0          0          0          0          0          0   PCI-MSI-edge      mlx5_comp2@pci:8b76:00:02.0
 31:      89403          0          0          0          0          0          0          0   PCI-MSI-edge      mlx5_comp3@pci:8b76:00:02.0
 32:      86793          0          0          0          0          0          0          0   PCI-MSI-edge      mlx5_comp4@pci:8b76:00:02.0
 33:     107817          0          0          0          0          0          0          0   PCI-MSI-edge      mlx5_comp5@pci:8b76:00:02.0
 34:     117091          0          0          0          0          0          0          0   PCI-MSI-edge      mlx5_comp6@pci:8b76:00:02.0
 35:      59714          0          0          0          0          0          0          0   PCI-MSI-edge      mlx5_comp7@pci:8b76:00:02.0
 36:          0          0          0          0          0          0          0          0   PCI-MSI-edge      mlx5_pages_eq@pci:83a4:00:02.0
 37:      12427          0          0          0          0          0          0          0   PCI-MSI-edge      mlx5_cmd_eq@pci:83a4:00:02.0
 38:          0          0          0          0          0          0          0          0   PCI-MSI-edge      mlx5_async_eq@pci:83a4:00:02.0
 40:      35520          0          0          0          0          0          0          0   PCI-MSI-edge      mlx5_comp0@pci:83a4:00:02.0
 41:        576          0          0          0          0          0          0          0   PCI-MSI-edge      mlx5_comp1@pci:83a4:00:02.0
 42:      34139          0          0          0          0          0          0          0   PCI-MSI-edge      mlx5_comp2@pci:83a4:00:02.0
 43:      19951          0          0          0          0          0          0          0   PCI-MSI-edge      mlx5_comp3@pci:83a4:00:02.0
 44:      41038          0          0          0          0          0          0          0   PCI-MSI-edge      mlx5_comp4@pci:83a4:00:02.0
 45:      36569          0          0          0          0          0          0          0   PCI-MSI-edge      mlx5_comp5@pci:83a4:00:02.0
 46:      42023          0          0          0          0          0          0          0   PCI-MSI-edge      mlx5_comp6@pci:83a4:00:02.0
 47:      12610          0          0          0          0          0          0          0   PCI-MSI-edge      mlx5_comp7@pci:83a4:00:02.0
NMI:          0          0          0          0          0          0          0          0   Non-maskable interrupts
LOC:       1536       1224       1240       1107       1299       1379       1171       2152   Local timer interrupts
SPU:          0          0          0          0          0          0          0          0   Spurious interrupts
PMI:          0          0          0          0          0          0          0          0   Performance monitoring interrupts
IWI:        726        143        776        309        780        370        748       1047   IRQ work interrupts
RTR:          0          0          0          0          0          0          0          0   APIC ICR read retries
RES:      59746      34162     150579      45146     149421      87954     149095      47137   Rescheduling interrupts
CAL:       2562       2717       2601       2590       2577       2649       2572       2557   Function call interrupts
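
To make the imbalance easier to quantify, here is a minimal Python sketch (not part of the original report) that totals the mlx5 completion interrupts per CPU from /proc/interrupts and prints the affinity configured for each of those IRQs. The function names are hypothetical. It assumes the usual /proc/interrupts layout (a CPU header row, then one row per IRQ) and that /proc/irq/<n>/smp_affinity_list is readable on the running kernel.

```python
from pathlib import Path


def per_cpu_totals(interrupts_text, name_filter="mlx5_comp"):
    """Sum per-CPU counts over IRQ rows whose text contains name_filter."""
    lines = interrupts_text.splitlines()
    cpus = lines[0].split()              # header row: CPU0 CPU1 ...
    totals = [0] * len(cpus)
    irqs = []
    for line in lines[1:]:
        if name_filter not in line:
            continue
        # Split on the first colon only; device names also contain colons.
        irq, _, rest = line.partition(":")
        counts = rest.split()[:len(cpus)]
        if len(counts) < len(cpus) or not all(c.isdigit() for c in counts):
            continue                     # skip rows that do not match the layout
        for i, count in enumerate(counts):
            totals[i] += int(count)
        irqs.append(irq.strip())
    return irqs, dict(zip(cpus, totals))


def cpu_mask_hex(cpu):
    """Hex bitmask selecting a single CPU, in the form smp_affinity expects."""
    return format(1 << cpu, "x")


def configured_affinity(irq, proc_root="/proc"):
    """Return the CPU list configured for an IRQ, or None if it is unreadable."""
    path = Path(proc_root) / "irq" / str(irq) / "smp_affinity_list"
    try:
        return path.read_text().strip()
    except OSError:
        return None


if __name__ == "__main__":
    irqs, totals = per_cpu_totals(Path("/proc/interrupts").read_text())
    print("mlx5 completion interrupts per CPU:", totals)
    for irq in irqs:
        print("IRQ", irq, "configured affinity:", configured_affinity(irq))
```

If the configured affinity differs per IRQ while the totals remain concentrated on CPU0, the mask written through /proc is evidently not being applied to interrupt delivery, which matches the behaviour described above.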

The Mellanox driver version is:
version:        5.0-0
license:        Dual BSD/GPL
description:    Mellanox 5th generation network adapters (ConnectX series) core driver
author:         Eli Cohen <eli@xxxxxxxxxxxx>
rhelversion:    7.7
srcversion:     7D9FFD656B0EB1000804CB2

The same kernel works fine with a different NIC driver in AWS and with the igb driver on a physical server.
I also tried the CentOS 7.9 image (3.10.0-1160.76.1.el7) available in the Azure Marketplace, and the issue does not occur there.

Any help in debugging or resolving this issue would be appreciated.

Please CC katiyar26@xxxxxxxxxxx when replying.

Regards,
Nitin