Re: Virtio-scsi multiqueue irq affinity

From: liaochang (A)
Date: Sun May 09 2021 - 23:19:54 EST


Hi Thomas,

On 2021/5/8 20:26, Thomas Gleixner wrote:
> Yihang,
>
> On Sat, May 08 2021 at 15:52, xuyihang wrote:
>>
>> We are dealing with a scenario which may need to assign a default
>> irqaffinity for managed IRQ.
>>
>> Assume we have an RT thread at full CPU usage, bound to a specific
>> CPU.
>>
>> In the meantime, the interrupt handler registered by a device, which is
>> ksoftirqd, may never have a chance to run. (And we don't want to use
>> isolated CPUs.)
>
> A device cannot register an interrupt handler in ksoftirqd.

I learned more about the scenario after talking with Yihang offline:
1. We have a machine with 36 CPUs and assign several RT threads to the last two CPUs (CPU-34, CPU-35).
2. The I/O device driver creates a single managed IRQ whose affinity includes CPU-34 and CPU-35.
3. Another regular application issues I/O from CPUs other than the ones the RT threads use;
CPU-34/35 then receive the hardware interrupt and wake up ksoftirqd to do the real I/O work.
4. Because the priority and scheduling policy of the RT threads override the per-CPU ksoftirqd,
ksoftirqd seems to have no chance to run on CPU-34/35, so I/O processing cannot finish in time
and the application gets stuck.
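
To confirm step 2 on a given machine, the IRQ's affinity can be read from
/proc/irq/<N>/smp_affinity_list and compared with the CPUs the RT threads are
pinned to. Below is a minimal Python sketch of that check. The function names
are made up for illustration, the RT CPU set {34, 35} is taken from the
scenario above, and the parser assumes the plain "a-b,c" list format that
/proc prints.

```python
def parse_cpu_list(text):
    """Parse a CPU list such as '0-3,34-35' into a set of CPU numbers."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def irq_affinity(irq, proc_root="/proc/irq"):
    """Read the affinity CPU list of one IRQ (irq is the IRQ number)."""
    with open(f"{proc_root}/{irq}/smp_affinity_list") as f:
        return parse_cpu_list(f.read())


def cpus_at_risk(irq_cpus, rt_cpus):
    """Return the CPUs that can both take the IRQ and run RT threads."""
    return sorted(irq_cpus & rt_cpus)


# Example with a literal list instead of a real /proc read.
print(cpus_at_risk(parse_cpu_list("34-35"), {34, 35}))
```

On the real system one would call cpus_at_risk(irq_affinity(N), {34, 35})
with N being the IRQ number of the device queue; a non-empty result means the
interrupt can be delivered to a CPU occupied by the RT threads.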

>
>> There could be a couple of ways to deal with this problem:
>>
>> 1. Adjust priority of ksoftirqd or RT thread, so the interrupt handler
>> could preempt RT thread. However, I am not sure whether it could have
>> some side effects or not.
>>
>> 2. Adjust interrupt CPU affinity or RT thread affinity. But managed IRQ
>> seems designed to forbid users from manipulating interrupt affinity.
>>
>> It seems to me that managed IRQ is coupled with the user-side application.
>>
>> Would you share your thoughts about this issue please?
>
> Can you please provide a more detailed description of your system?
>
> - Number of CPUs
>
> - Kernel version
>
> - Is NOHZ full enabled?
>
> - Any isolation mechanisms enabled, and if so how are they
>   configured (e.g. on the kernel command line)?
>
> - Number of queues in the multiqueue device
>
> - Is the RT thread issuing I/O to the multiqueue device?
>
> Thanks,
>
> tglx
>
BR,
Liao Chang