Re: Virtio-scsi multiqueue irq affinity

From: xuyihang
Date: Mon May 10 2021 - 04:48:46 EST


Thomas,

On 2021/5/8 20:26, Thomas Gleixner wrote:
Yihang,

On Sat, May 08 2021 at 15:52, xuyihang wrote:
We are dealing with a scenario which may need to assign a default
irq affinity for managed IRQs.

Assume we have an RT thread with full CPU usage bound to a specific
CPU.

Meanwhile, an interrupt handler registered by a device, which is
ksoftirqd, may never have a chance to run. (And we don't want to use
isolated CPUs.)
A device cannot register an interrupt handler in ksoftirqd.

There could be a couple of ways to deal with this problem:

1. Adjust the priority of ksoftirqd or of the RT thread, so that the
interrupt handler could preempt the RT thread. However, I am not sure
whether that could have side effects or not.

2. Adjust the interrupt CPU affinity or the RT thread affinity. But
managed IRQs seem designed to forbid users from manipulating interrupt
affinity. (Both options are sketched below.)
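
To make the two options concrete, a rough sketch of what I mean (the IRQ
number 27 and the ksoftirqd/0 thread are taken from the reproducer below,
and the priority value is only an example):

# Option 1: raise ksoftirqd/0 above the SCHED_FIFO prio 1 RT thread
chrt -f -p 2 $(pgrep -x 'ksoftirqd/0')

# Option 2: try to move the managed IRQ away from CPU0, which the
# kernel refuses for managed interrupts
echo 2 > /proc/irq/27/smp_affinity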

It seems to me that managed IRQs are coupled with the user-side application.

Would you share your thoughts about this issue please?
Can you please provide a more detailed description of your system?

- Number of CPUs
It's a 4 CPU x86 VM.
- Kernel version
This experiment was run on linux-4.19.
- Is NOHZ full enabled?
nohz=off
- Any isolation mechanisms enabled, and if so how are they
configured (e.g. on the kernel command line)?

Some cores are isolated via the command line (e.g. isolcpus=3) and bound
to RT threads; no other isolation is configured.

- Number of queues in the multiqueue device

Only one queue.

[root@localhost ~]# cat /proc/interrupts | grep request
 27:       5499          0          0          0   PCI-MSI 65539-edge      virtio1-request

This environment is a virtual machine and the device is a virtio device;
I guess it should not make any difference in this case.
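
For reference, the affinity of this managed IRQ (IRQ 27 from the output
above) can be inspected with:

cat /proc/irq/27/smp_affinity_list
cat /proc/irq/27/effective_affinity_list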

- Is the RT thread issuing I/O to the multiqueue device?

The RT thread doesn't issue IO.



We simplified the reproduction procedure:

1. Start a busy-looping program with nearly 100% CPU usage, named print:

./print 1 1 &
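
(print is just our own test program; a trivial shell busy loop such as
the following should serve as a stand-in for it:)

# burns nearly 100% of one CPU, like the print test program
sh -c 'while :; do :; done' &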


2. Make the program a realtime application:

chrt -f -p 1 11514


3. Bind the RT process to the core that handles the **managed irq**:

taskset -cpa 0 11514


4. Use dd to write to the hard drive; dd never finishes and returns:

dd if=/dev/zero of=/test.img bs=1K count=1 oflag=direct,sync &


Since the CPU is fully utilized by the RT application and the hard drive
driver chooses CPU0 to handle its softirq, dd never gets a chance to make
progress:

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM TIME+ COMMAND
  11514 root      -2   0    2228    740    676 R 100.0   0.0 3:26.70 print
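
One way to observe the starvation (assuming it is the BLOCK softirq that
is being held off) is to watch the per-CPU softirq counters stop
increasing on CPU0:

watch -d "grep -E 'CPU|BLOCK' /proc/softirqs"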


If we make some changes to this experiment:

1. If we make the RT application use less CPU time instead of 100%, the
problem disappears.

2. If we change rq_affinity to 2, so that the softirq is not handled on
the same core as the RT thread, the problem also disappears. However,
this approach results in roughly a 10%-30% random write performance
reduction compared to rq_affinity = 1, which presumably has better cache
utilization:

echo 2 > /sys/block/sda/queue/rq_affinity
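
For comparing the two settings, something along the lines of the following
fio run could be used (parameters are only illustrative, not our exact job):

fio --name=randwrite --filename=/test.img --rw=randwrite --bs=4k \
    --size=100M --direct=1 --runtime=30 --time_based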


Therefore, I want to exclude some CPUs from managed IRQs via a boot
parameter, which is a similar approach to 11ea68f553e2 ("genirq,
sched/isolation: Isolate from handling managed interrupts").
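
On kernels which already carry that commit, this would (if I understand
the option correctly) amount to adding something like the following to
the kernel command line, with the CPU list matching the reproducer above:

isolcpus=managed_irq,0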


Thanks,

Yihang