Re: [PATCH 1/2] PM: runtime: Fix I/O hang due to race between resume and runtime disable
From: YangYang
Date: Tue Dec 02 2025 - 05:34:06 EST
On 2025/12/2 2:55, Rafael J. Wysocki wrote:
On Mon, Dec 1, 2025 at 1:56 PM YangYang <yang.yang@xxxxxxxx> wrote:
On 2025/12/1 17:46, YangYang wrote:
On 2025/11/27 20:34, Rafael J. Wysocki wrote:
On Wed, Nov 26, 2025 at 11:47 PM Bart Van Assche <bvanassche@xxxxxxx> wrote:
On 11/26/25 1:30 PM, Rafael J. Wysocki wrote:
On Wed, Nov 26, 2025 at 10:11 PM Bart Van Assche <bvanassche@xxxxxxx> wrote:
On 11/26/25 12:17 PM, Rafael J. Wysocki wrote:
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -309,6 +309,8 @@ int blk_queue_enter(struct request_queue
if (flags & BLK_MQ_REQ_NOWAIT)
return -EAGAIN;
+ /* if necessary, resume .dev (assume success). */
+ blk_pm_resume_queue(pm, q);
/*
* read pair of barrier in blk_freeze_queue_start(), we need to
* order reading __PERCPU_REF_DEAD flag of .q_usage_counter and
blk_queue_enter() may be called from the suspend path so I don't think
that the above change will work.
Why would the existing code work then?
The existing code works reliably on a very large number of devices.
Well, except that it doesn't work during system suspend and
hibernation when the PM workqueue is frozen. I think that we agree
here.
This needs to be addressed because it may very well cause system
suspend to deadlock.
There are two possible ways to address it I can think of:
1. Changing blk_pm_resume_queue() and its users to carry out a
synchronous resume of q->dev instead of calling pm_request_resume()
and (effectively) waiting for the queued-up runtime resume of q->dev
to take effect.
This would be my preferred option, but at this point I'm not sure if
it's viable.
After __pm_runtime_disable() is called from device_suspend_late(), dev->power.disable_depth is set, preventing
rpm_resume() from making progress until the system resume completes, regardless of whether rpm_resume() is invoked
synchronously or asynchronously.
Performing a synchronous resume of q->dev seems to have a similar effect to removing the following code block from
__pm_runtime_barrier(), which is invoked by __pm_runtime_disable():
1428 if (dev->power.request_pending) {
1429 dev->power.request = RPM_REQ_NONE;
1430 spin_unlock_irq(&dev->power.lock);
1431
1432 cancel_work_sync(&dev->power.work);
1433
1434 spin_lock_irq(&dev->power.lock);
1435 dev->power.request_pending = false;
1436 }
Since both synchronous and asynchronous resumes face similar issues,
No, they don't.
it may be sufficient to keep using the asynchronous resume path as long as
pending work items are not canceled while the PM workqueue is frozen.
Except for two things:
1. If blk_queue_enter() or __bio_queue_enter() is allowed to race with
disabling runtime PM, queuing up the resume work item may fail in the
first place.
Perhaps my understanding is incorrect, but during the execution of
device_suspend_late(), the PM workqueue should already be frozen.
In that case, queuing a resume work item would not fail; it would
simply not be executed until the workqueue is unfrozen, as long as
it is not canceled.
2. If a device runtime resume work item is queued up before the whole
system is suspended, it may not make sense to run that work item after
resuming the whole system because the state of the system as a whole
is generally different at that point.
This allows the pending work to proceed normally once the PM workqueue
is unfrozen.
Not really.