Re: [PATCH] scsi: sd: add runtime pm to open / release

From: Alan Stern
Date: Mon Jun 29 2020 - 17:34:10 EST


On Mon, Jun 29, 2020 at 09:56:49AM -0700, Bart Van Assche wrote:
> On 2020-06-29 09:15, Alan Stern wrote:
> > Aha. Looking at this more closely, it's apparent that the code in
> > blk-core.c contains a logic bug: It assumes that if the BLK_MQ_REQ_PREEMPT
> > flag is set then the request can be issued regardless of the queue's
> > runtime status. That is not correct when the queue is suspended.
>
> Please clarify why this is not correct.

As I understand it, BLK_MQ_REQ_PREEMPT is supposed to mean (among other
things) that this request may be issued as part of the procedure for
putting a device into a low-power state or returning it to a high-power
state. Consequently, requests with that flag set must be allowed while
the queue is in the RPM_SUSPENDING or RPM_RESUMING runtime states -- as
opposed to ordinary requests, which are allowed only in the RPM_ACTIVE
state.

In the RPM_SUSPENDED state, however, the queue is entirely inactive. Even
if a request were to be issued somehow, it would fail because the system
would not be able to transmit it to the device. In other words, when the
queue is in the RPM_SUSPENDED state, a resume must be requested before
_any_ request can be issued.

> > Index: usb-devel/block/blk-core.c
> > ===================================================================
> > --- usb-devel.orig/block/blk-core.c
> > +++ usb-devel/block/blk-core.c
> > @@ -423,7 +423,8 @@ int blk_queue_enter(struct request_queue
> > * responsible for ensuring that that counter is
> > * globally visible before the queue is unfrozen.
> > */
> > - if (pm || !blk_queue_pm_only(q)) {
> > + if ((pm && q->rpm_status != RPM_SUSPENDED) ||
> > + !blk_queue_pm_only(q)) {
> > success = true;
> > } else {
> > percpu_ref_put(&q->q_usage_counter);
>
> Does the above change make it impossible to bring a suspended device
> back to the RPM_ACTIVE state if the BLK_MQ_REQ_NOWAIT flag is set?

The only case affected by this change is when BLK_MQ_REQ_PREEMPT is set
and the queue is in the RPM_SUSPENDED state. If BLK_MQ_REQ_NOWAIT was
also set, the original code would set "success" to true, allowing the
request to proceed even though it could not be carried out immediately --
a bug.

With the patch, such a request will fail without resuming the device. I
don't know whether that is the desired behavior or not, but at least it's
not obviously a bug.

It does seem odd that blk_queue_enter() tests the queue's pm_only status
and the request flag in two different spots (here and below). Why does it
do this? It seems like an invitation for bugs.

> > @@ -448,8 +449,7 @@ int blk_queue_enter(struct request_queue
> >
> > wait_event(q->mq_freeze_wq,
> > (!q->mq_freeze_depth &&
> > - (pm || (blk_pm_request_resume(q),
> > - !blk_queue_pm_only(q)))) ||
> > + blk_pm_resume_queue(pm, q)) ||
> > blk_queue_dying(q));
> > if (blk_queue_dying(q))
> > return -ENODEV;
> > Index: usb-devel/block/blk-pm.h
> > ===================================================================
> > --- usb-devel.orig/block/blk-pm.h
> > +++ usb-devel/block/blk-pm.h
> > @@ -6,11 +6,14 @@
> > #include <linux/pm_runtime.h>
> >
> > #ifdef CONFIG_PM
> > -static inline void blk_pm_request_resume(struct request_queue *q)
> > +static inline int blk_pm_resume_queue(const bool pm, struct request_queue *q)
> > {
> > - if (q->dev && (q->rpm_status == RPM_SUSPENDED ||
> > - q->rpm_status == RPM_SUSPENDING))
> > - pm_request_resume(q->dev);
> > + if (!q->dev || !blk_queue_pm_only(q))
> > + return 1; /* Nothing to do */
> > + if (pm && q->rpm_status != RPM_SUSPENDED)
> > + return 1; /* Request allowed */
> > + pm_request_resume(q->dev);
> > + return 0;
> > }
>
> Does the above change, especially the " && q->rpm_status !=
> RPM_SUSPENDED" part, make it impossible to bring a suspended device back
> to the RPM_ACTIVE state?

Just the opposite -- the change makes it _possible_ for a
BLK_MQ_REQ_PREEMPT request to bring a suspended device back to the
RPM_ACTIVE state.

Look at the existing code: If pm is true then blk_pm_request_resume() will
be skipped, so the device won't be resumed. With this patch -- in
particular with the "&& q->rpm_status != RPM_SUSPENDED" part added -- the
call won't be skipped and so the resume will take place.

The rather complicated syntax of the wait_event() call in the existing
code contributes to this confusion. One of the things my patch tries to
do is make the code more straightforward and easier to grasp.

I admit that there are parts to this thing I don't understand. The
wait_event() call in blk_queue_enter(), for example: If we are waiting for
the device to leave the RPM_SUSPENDED state (or enter the RPM_ACTIVE
state), where does q->mq_freeze_wq get woken up? There's no obvious spot
in blk_{pre|post}_runtime_resume().

Alan Stern