Re: Question on handling managed IRQs when hotplugging CPUs

From: John Garry
Date: Tue Jan 29 2019 - 12:13:04 EST


On 29/01/2019 15:44, Keith Busch wrote:
> On Tue, Jan 29, 2019 at 03:25:48AM -0800, John Garry wrote:
>> Hi,
>>
>> I have a question on $subject which I hope you can shed some light on.
>>
>> According to commit c5cb83bb337c25 ("genirq/cpuhotplug: Handle managed
>> IRQs on CPU hotplug"), if we offline the last CPU in a managed IRQ
>> affinity mask, the IRQ is shut down.
>>
>> The reasoning is that this IRQ is thought to be associated with a
>> specific queue on an MQ device, and the CPUs in the IRQ affinity mask
>> are the same CPUs associated with the queue. So, if no CPU is using the
>> queue, then there is no need for the IRQ.
>>
>> However, how does this handle the scenario of the last CPU in an IRQ
>> affinity mask being offlined while IO associated with the queue is
>> still in flight?
>>
>> Or what if we decide to use the queue associated with the current CPU,
>> and then that CPU (being the last CPU online in the queue's IRQ
>> affinity mask) goes offline and we finish the delivery with another CPU?
>>
>> In these cases, when the IO completes, it would not be serviced and
>> would time out.
>>
>> I have actually tried this on my arm64 system and I see IO timeouts.
>
> Hm, we used to freeze the queues with the CPUHP_BLK_MQ_PREPARE callback,
> which would reap all outstanding commands before the CPU and IRQ are
> taken offline. That was removed with commit 4b855ad37194f ("blk-mq:
> Create hctx for each present CPU"). It sounds like we should bring
> something like that back, but make it more fine-grained, down to the
> per-CPU context.


Seems reasonable. But we would need it to deal with drivers that expose only
a single queue to blk-mq but use many queues internally. I think
megaraid_sas does this, for example.
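
To check that I follow what you're suggesting, is the below roughly the shape
of it? This is only a sketch I put together to illustrate the idea --
blk_mq_hctx_cpu_offline() and blk_mq_hctx_drain_inflight() are made-up names
for illustration, not existing symbols:

static int blk_mq_hctx_cpu_offline(unsigned int cpu, struct hlist_node *node)
{
	struct blk_mq_hw_ctx *hctx =
		hlist_entry_safe(node, struct blk_mq_hw_ctx, cpuhp_dead);
	unsigned int other;

	/* Another CPU mapped to this hctx is still online: nothing to do */
	for_each_cpu_and(other, hctx->cpumask, cpu_online_mask)
		if (other != cpu)
			return 0;

	/*
	 * @cpu is the last online CPU serving this hctx, so stop new
	 * submissions and wait for inflight requests to complete before
	 * the queue's managed IRQ gets shut down.
	 */
	blk_mq_hctx_drain_inflight(hctx);	/* hypothetical helper */
	return 0;
}

The callback could be registered per-hctx, e.g. via
cpuhp_state_add_instance_nocalls(), similar to how the CPUHP_BLK_MQ_DEAD
notifier is hooked up today.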

I would also be slightly concerned about commands which the driver issues
outside of blk-mq's knowledge, like SCSI TMFs.
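
For reference, the shutdown behaviour I mentioned comes from the check which
that commit adds to migrate_one_irq() in kernel/irq/cpuhotplug.c -- roughly
the below, simplified from memory rather than quoted verbatim (d, desc and
affinity being the irq_data, irq_desc and affinity mask in that function):

	if (irqd_affinity_is_managed(d) &&
	    cpumask_any_and(affinity, cpu_online_mask) >= nr_cpu_ids) {
		/*
		 * No CPU in the managed affinity mask remains online,
		 * so shut the interrupt down rather than migrating it.
		 */
		irq_shutdown(desc);
		return false;
	}

Once that path is taken, any completion the device raises on that vector is
never delivered, which matches the timeouts I'm seeing.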

Thanks,
John
