Re: Kernel-managed IRQ affinity (cont)

From: Thomas Gleixner
Date: Fri Jan 10 2020 - 14:43:19 EST


Ming,

Ming Lei <ming.lei@xxxxxxxxxx> writes:
> On Thu, Jan 09, 2020 at 09:02:20PM +0100, Thomas Gleixner wrote:
>> Ming Lei <ming.lei@xxxxxxxxxx> writes:
>>
>> This is duct tape engineering with absolutely no semantics. I can't even
>> figure out the intent of this 'managed_irq' parameter.
>
> The intent is to isolate the specified CPUs from handling managed
> interrupts.

That's what I figured, but it still does not define any semantics and
works only for specific cases.

> We can do that. The big problem is that the RT case can't always guarantee
> that IO won't be submitted from an isolated CPU. blk-mq's queue mapping
> relies on the affinity that was set up, so unknown behavior (kernel crash,
> IO hang, or other) may result if we exclude isolated CPUs from interrupt
> affinity.
>
> That is why I try to exclude isolated CPUs from the interrupt effective
> affinity. It turns out the approach is simple and doable.

Yes, it's doable. But it still is inconsistent behaviour. Assume the
following configuration:

8 CPUs, with CPU0,1 assigned for housekeeping (CPU2-7 isolated)

With 8 queues the proposed change does nothing because each queue is
mapped to exactly one CPU.

With 4 queues you get the following:

  CPU0,1        queue 0
  CPU2,3        queue 1
  CPU4,5        queue 2
  CPU6,7        queue 3

No effect on the isolated CPUs either.

With 2 queues you get the following:

  CPU0,1,2,3    queue 0
  CPU4,5,6,7    queue 1

So here the isolated CPUs 2 and 3 get the isolation, but CPUs 4-7 do
not, because queue 1 has no housekeeping CPU in its mask. That's perhaps
intended, but definitely not documented.
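
To make the three cases easy to reproduce, here is a small Python model of
the example. It is only a sketch, not kernel code. All names in it are
hypothetical, the contiguous CPU-to-queue spreading is a simplification that
matches the tables above, and the rule "prefer housekeeping CPUs in the mask,
otherwise fall back to the full mask" is an assumption inferred from the
outcomes described here.

```python
# Hypothetical model of the 8-CPU example; not kernel code.
NR_CPUS = 8
HOUSEKEEPING = {0, 1}  # CPUs 2-7 are isolated


def spread_queues(nr_queues, nr_cpus=NR_CPUS):
    """Assign contiguous, equally sized CPU groups to each queue."""
    per_queue = nr_cpus // nr_queues
    return [set(range(q * per_queue, (q + 1) * per_queue))
            for q in range(nr_queues)]


def effective_affinity(mask, housekeeping=HOUSEKEEPING):
    """Assumed rule: use housekeeping CPUs in the mask if there are any,
    otherwise fall back to the full mask."""
    preferred = mask & housekeeping
    return preferred if preferred else mask


def isolated_cpus_hit(nr_queues):
    """Isolated CPUs that can still be targeted by a managed interrupt."""
    hit = set()
    for mask in spread_queues(nr_queues):
        hit |= effective_affinity(mask) - HOUSEKEEPING
    return sorted(hit)


for n in (8, 4, 2):
    print(f"{n} queues: isolated CPUs still handling interrupts:",
          isolated_cpus_hit(n))
```

Under these assumptions only the 2-queue case changes anything, and only for
CPUs 2 and 3, which is the inconsistency described above.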

So you really need to make your mind up and describe what the intended
effect of this is and why you think that the result is correct.

Thanks,

tglx