Re: Virtio-scsi multiqueue irq affinity

From: liaochang (A)
Date: Mon May 17 2021 - 21:37:32 EST


Thomas,

On 2021/5/10 15:54, Thomas Gleixner wrote:
> Liao,
>
> On Mon, May 10 2021 at 11:19, liaochang wrote:
>> 1. We have a machine with 36 CPUs, and we assign several RT threads to the
>> last two CPUs (CPU-34, CPU-35).
>
> Which kind of machine? x86?
>
>> 2. The I/O device driver creates a single managed irq, whose affinity
>> includes CPU-34 and CPU-35.
>
> If that driver creates only a single managed interrupt, then the
> possible affinity of that interrupt spans CPUs 0 - 35.
>
> That's expected, but what is the effective affinity of that interrupt?
>
> # cat /proc/irq/$N/effective_affinity
>
> Also please provide the full output of
>
> # cat /proc/interrupts
>
> and point out which device we are talking about.

The managed irq mentioned above is registered by the virtio-scsi driver over PCI
(on an x86 platform, a VM with 4 vCPUs), as shown below.

#lspci -vvv
...
00:04.0 SCSI storage controller: Virtio: Virtio SCSI
Subsystem: Virtio: Device 0008
Physical Slot: 4
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Interrupt: pin A routed to IRQ 11
Region 0: I/O ports at c140 [size=64]
Region 1: Memory at febd2000 (32-bit, non-prefetchable) [size=4K]
Region 4: Memory at fe004000 (64-bit, prefetchable) [size=16K]
Capabilities: [98] MSI-X: Enable+ Count=4 Masked-
Vector table: BAR=1 offset=00000000
PBA: BAR=1 offset=00000800

#ls /sys/bus/pci/devices/0000:00:04.0/msi_irqs
33 34 35 36

#cat /proc/interrupts
...
33: 0 0 0 0 PCI-MSI 65536-edge virtio1-config
34: 0 0 0 0 PCI-MSI 65537-edge virtio1-control
35: 0 0 0 0 PCI-MSI 65538-edge virtio1-event
36: 10637 0 0 0 PCI-MSI 65539-edge virtio1-request

As you can see, virtio-scsi allocates four MSI-X interrupts, IRQ 33 to IRQ 36. The last one
is expected to fire when virtqueue data is ready to be received; its interrupt handler then
raises ksoftirqd to process the I/O. If I pin a SCHED_FIFO RT thread to CPU0, a simple I/O
operation issued by the command
"dd if=/dev/zero of=/test.img bs=1K count=1 oflag=direct,sync" never finishes.

Although that is expected behavior, do you think it poses a risk to Linux availability? In a
cloud environment, services owned by different teams can seriously affect each other because
of insufficient communication or an incomplete understanding of the infrastructure. Thanks.

This problem arises whenever an RT thread and ksoftirqd are scheduled on the same CPU. Besides
placing RT threads carefully, I also tried setting "rq_affinity" to 2, but the cost is a
10%~30% performance degradation on some I/O benchmarks. So I wonder: can the affinity of a
managed irq be configured from user space or via kernel boot arguments? Thanks.
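
For completeness, the rq_affinity setting above is applied per block device,
e.g. (assuming the virtio-scsi disk appears as sda; the device name will differ
per system):

# echo 2 > /sys/block/sda/queue/rq_affinity

A value of 2 forces the block layer to run request completions on the CPU that
submitted the request, which sidesteps the starved CPU but accounts for the
benchmark cost mentioned above.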

>
> Thanks,
>
> tglx
>
BR,
Liao, Chang