Re: [PATCH] blk-mq: Abort suspend when wakeup events are pending

From: Cong Zhang
Date: Tue Dec 02 2025 - 04:52:26 EST




On 12/2/2025 5:20 PM, Ming Lei wrote:
> On Tue, Dec 02, 2025 at 11:56:12AM +0800, Cong Zhang wrote:
>> During system suspend, wakeup capable IRQs for block device can be
>> delayed, which can cause blk_mq_hctx_notify_offline() to hang
>> indefinitely while waiting for pending request to complete.
>> Skip the request waiting loop and abort suspend when wakeup events are
>> pending to prevent the deadlock.
>>
>> Fixes: bf0beec0607d ("blk-mq: drain I/O when all CPUs in a hctx are offline")
>> Signed-off-by: Cong Zhang <cong.zhang@xxxxxxxxxxxxxxxx>
>> ---
>> The issue was found during system suspend with a no_soft_reset
>> virtio-blk device. Here is the detailed analysis:
>> - When system suspend starts and no_soft_reset is enabled, virtio-blk
>> does not call its suspend callback.
>> - Some requests are dispatched and queued. After sending the virtqueue
>> notifier, the kernel waits for an IRQ to complete the request.
>> - The virtio-blk IRQ is wakeup-capable. When the IRQ is triggered, it
>> remains pending because the device is in the suspend process.
>
> Can you explain a bit for above point? Why does the IRQ remains pending
> and not get handled?
>

The wakeup-capable IRQ is not masked during suspend. When the IRQ is
triggered, the kernel does not call its IRQ handler; instead, it only
marks the IRQ as a wakeup event in pm_system_irq_wakeup(). The suspend
process can then abort if pm_wakeup_pending() reports a wakeup event.
That means the actual IRQ handler is not called while
blk_mq_hctx_notify_offline() is polling blk_mq_hctx_has_requests(), so
the request never completes and the loop never exits, which causes the
issue.
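
To illustrate the behaviour described above, here is a minimal userspace
sketch, not the actual patch. All sim_* names are hypothetical stand-ins:
the wakeup_pending flag models pm_wakeup_pending() returning true,
irq_handled models whether the real IRQ handler gets to run, and
max_ticks exists only so the simulation terminates where the kernel loop
would hang.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the relevant state; not a kernel structure. */
struct sim_hctx {
	int inflight;        /* requests still waiting for a completion IRQ */
	bool wakeup_pending; /* models pm_wakeup_pending() returning true */
	bool irq_handled;    /* false while the wakeup IRQ handler is deferred */
};

/* Models one polling interval: a request completes only if the handler runs. */
static void sim_wait_tick(struct sim_hctx *h)
{
	if (h->irq_handled && h->inflight > 0)
		h->inflight--;
}

/*
 * Models the drain loop with the proposed check added.
 * Returns 0 once drained, -EBUSY if suspend should be aborted, and
 * -ETIMEDOUT if max_ticks elapse (standing in for the indefinite hang).
 */
static int sim_drain_or_abort(struct sim_hctx *h, int max_ticks)
{
	for (int i = 0; h->inflight > 0 && i < max_ticks; i++) {
		if (h->wakeup_pending)
			return -EBUSY; /* skip the wait, let suspend abort */
		sim_wait_tick(h);
	}
	return h->inflight == 0 ? 0 : -ETIMEDOUT;
}
```

With irq_handled false and wakeup_pending false the sketch runs out of
ticks, which corresponds to the hang; with wakeup_pending true it returns
-EBUSY on the first iteration instead of waiting.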

>
> Thanks,
> Ming
>