RE: [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT from the basic I/O timeout

From: KY Srinivasan
Date: Thu Jul 17 2014 - 19:54:04 EST




> -----Original Message-----
> From: driverdev-devel-bounces@xxxxxxxxxxxxxxxxxxxxxx
> [mailto:driverdev-devel-bounces@xxxxxxxxxxxxxxxxxxxxxx] On Behalf Of KY Srinivasan
> Sent: Friday, June 20, 2014 2:37 PM
> To: Jens Axboe; James Bottomley; michaelc@xxxxxxxxxxx
> Cc: linux-scsi@xxxxxxxxxxxxxxx; gregkh@xxxxxxxxxxxxxxxxxxx;
> jasowang@xxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; ohering@xxxxxxxx;
> hch@xxxxxxxxxxxxx; apw@xxxxxxxxxxxxx; devel@xxxxxxxxxxxxxxxxxxxxxx
> Subject: RE: [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT
> from the basic I/O timeout
>
>
>
> > -----Original Message-----
> > From: Jens Axboe [mailto:axboe@xxxxxxxxx]
> > Sent: Friday, June 6, 2014 11:23 AM
> > To: James Bottomley; michaelc@xxxxxxxxxxx
> > Cc: linux-kernel@xxxxxxxxxxxxxxx; hch@xxxxxxxxxxxxx;
> > devel@xxxxxxxxxxxxxxxxxxxxxx; apw@xxxxxxxxxxxxx; KY Srinivasan;
> > linux-scsi@xxxxxxxxxxxxxxx; ohering@xxxxxxxx; gregkh@xxxxxxxxxxxxxxxxxxx;
> > jasowang@xxxxxxxxxx
> > Subject: Re: [PATCH 1/1] [SCSI] Fix a bug in deriving the
> > FLUSH_TIMEOUT from the basic I/O timeout
> >
> > On 2014-06-06 11:52, James Bottomley wrote:
> > > On Fri, 2014-06-06 at 12:18 -0500, Mike Christie wrote:
> > >> On 6/5/14, 9:53 PM, KY Srinivasan wrote:
> > >>>
> > >>>
> > >>>> -----Original Message-----
> > >>>> From: Mike Christie [mailto:michaelc@xxxxxxxxxxx]
> > >>>> Sent: Thursday, June 5, 2014 6:33 PM
> > >>>> To: KY Srinivasan
> > >>>> Cc: James Bottomley; linux-kernel@xxxxxxxxxxxxxxx;
> > >>>> apw@xxxxxxxxxxxxx; devel@xxxxxxxxxxxxxxxxxxxxxx; hch@xxxxxxxxxxxxx;
> > >>>> linux-scsi@xxxxxxxxxxxxxxx; ohering@xxxxxxxx;
> > >>>> gregkh@xxxxxxxxxxxxxxxxxxx; jasowang@xxxxxxxxxx
> > >>>> Subject: Re: [PATCH 1/1] [SCSI] Fix a bug in deriving the
> > >>>> FLUSH_TIMEOUT from the basic I/O timeout
> > >>>>
> > >>>> On 06/04/2014 12:15 PM, KY Srinivasan wrote:
> > >>>>>
> > >>>>>
> > >>>>>> -----Original Message-----
> > >>>>>> From: James Bottomley [mailto:jbottomley@xxxxxxxxxxxxx]
> > >>>>>> Sent: Wednesday, June 4, 2014 10:02 AM
> > >>>>>> To: KY Srinivasan
> > >>>>>> Cc: linux-kernel@xxxxxxxxxxxxxxx; apw@xxxxxxxxxxxxx;
> > >>>>>> devel@xxxxxxxxxxxxxxxxxxxxxx; hch@xxxxxxxxxxxxx;
> > >>>>>> linux-scsi@xxxxxxxxxxxxxxx; ohering@xxxxxxxx;
> > >>>>>> gregkh@xxxxxxxxxxxxxxxxxxx; jasowang@xxxxxxxxxx
> > >>>>>> Subject: Re: [PATCH 1/1] [SCSI] Fix a bug in deriving the
> > >>>>>> FLUSH_TIMEOUT from the basic I/O timeout
> > >>>>>>
> > >>>>>> On Wed, 2014-06-04 at 09:33 -0700, K. Y. Srinivasan wrote:
> > >>>>>>> Commit ID: 7e660100d85af860e7ad763202fff717adcdaacd added code
> > >>>>>>> to derive the FLUSH_TIMEOUT from the basic I/O timeout. However,
> > >>>>>>> this patch did not use the basic I/O timeout of the device.
> > >>>>>>> Fix this bug.
> > >>>>>>>
> > >>>>>>> Signed-off-by: K. Y. Srinivasan <kys@xxxxxxxxxxxxx>
> > >>>>>>> ---
> > >>>>>>> drivers/scsi/sd.c | 4 +++-
> > >>>>>>> 1 files changed, 3 insertions(+), 1 deletions(-)
> > >>>>>>>
> > >>>>>>> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
> > >>>>>>> index e9689d5..54150b1 100644
> > >>>>>>> --- a/drivers/scsi/sd.c
> > >>>>>>> +++ b/drivers/scsi/sd.c
> > >>>>>>> @@ -832,7 +832,9 @@ static int sd_setup_write_same_cmnd(struct scsi_device *sdp, struct request *rq)
> > >>>>>>>
> > >>>>>>>  static int scsi_setup_flush_cmnd(struct scsi_device *sdp, struct request *rq)
> > >>>>>>>  {
> > >>>>>>> -	rq->timeout *= SD_FLUSH_TIMEOUT_MULTIPLIER;
> > >>>>>>> +	int timeout = sdp->request_queue->rq_timeout;
> > >>>>>>> +
> > >>>>>>> +	rq->timeout = (timeout * SD_FLUSH_TIMEOUT_MULTIPLIER);
> > >>>>>>
> > >>>>>> Could you share where you found this to be a problem? It looks
> > >>>>>> like a bug in block because all inbound requests being prepared
> > >>>>>> should have a timeout set, so block would be the place to fix it.
> > >>>>>
> > >>>>> Perhaps; what I found was that the value in rq->timeout was 0
> > >>>>> coming into this function and thus multiplying obviously has no
> > >>>>> effect.
> > >>>>>
> > >>>>
> > >>>> I think you are right. We hit this problem because we are doing:
> > >>>>
> > >>>> scsi_request_fn -> blk_peek_request -> sd_prep_fn ->
> > >>>> scsi_setup_flush_cmnd.
> > >>>>
> > >>>> At this time request->timeout is zero so the multiplication does
> > >>>> nothing. See how sd_setup_write_same_cmnd will set the
> > >>>> request->timeout at this time.
> > >>>>
> > >>>> Then in scsi_request_fn we do:
> > >>>>
> > >>>> scsi_request_fn -> blk_start_request -> blk_add_timer.
> > >>>>
> > >>>> At this time it will set the request->timeout if something like
> > >>>> req block pc users (like scsi_execute() or block/scsi_ioctl.c) or
> > >>>> the write same code mentioned above have not set the timeout.
> > >>>
> > >>> I don't think this is a recent change. Prior to this commit, we
> > >>> were setting the timeout value in this function; it just happened
> > >>> to be a different constant unrelated to the I/O timeout.
> > >>>
> > >>
> > >> Yeah, it looks like when 7e660100d85af860e7ad763202fff717adcdaacd
> > >> was merged we were supposed to initialize it like in your patch in
> > >> this thread.
> > >>
> > >> I guess we could do your patch in this thread, or if we want the
> > >> block layer to initialize the timeout before the prep_fn callout is
> > >> called then we would need to have the blk-flush.c code do that when
> > >> it sets up the request. If we do the latter, do we want the discard
> > >> and write same code to initialize the request's timeout before the
> > >> prep_fn callout is called too?
> > >
> > > I looked through the call chain; it seems to be intentional
> > > behaviour on the part of block. Just from an mq point of view, it
> > > would make better code if we unconditionally initialised rq->timeout
> > > early and allowed prep to modify it and then dumped the
> > > if(!req->timeout) in blk_add_timer(), but it's a marginal if
> > > condition that would compile to a conditional store on sensible
> > > > architectures, so losing the conditional probably isn't worth
> > > > worrying about.
> > >
> > > Cc'd Jens for his opinion with the block patch
> >
> > I just committed this one earlier today:
> >
> > http://git.kernel.dk/?p=linux-block.git;a=commit;h=f6be4fb4bcb396fc3b1c134b7863351972de081f
> >
> > since I ran into the same thing on nvme. Either approach is fine with
> > me, as they both allow override of the timeout before insertion. But
> > we've always done the rq->timeout = 0 init, so I think we should just
> > reinstate that behavior.
>
> James,
>
> How is this being fixed now?
>
> Regards,
>
> K. Y

I still see this problem. There was talk of fixing it elsewhere.

Regards,

K. Y
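
To make the failure mode discussed in the quoted thread concrete, here is a
minimal user-space sketch in C. It is not kernel code. The struct and function
names (model_queue, model_request, prep_flush_old, prep_flush_new,
model_add_timer, effective_timeout) are hypothetical stand-ins, and the
multiplier of 2 and the 30-second queue default are assumed values chosen only
for illustration. The fallback in model_add_timer is meant to mirror the
if(!req->timeout) check in blk_add_timer() as described above.

```c
#include <assert.h>

/* Assumed values for illustration only; not taken from the thread. */
#define MODEL_FLUSH_TIMEOUT_MULTIPLIER 2
#define MODEL_DEFAULT_TIMEOUT 30 /* queue default, in seconds */

/* Hypothetical stand-ins for struct request_queue and struct request. */
struct model_queue {
	unsigned int rq_timeout;
};

struct model_request {
	struct model_queue *q;
	unsigned int timeout;
};

/* Old behaviour: scales rq->timeout, which is still 0 at prep time. */
static void prep_flush_old(struct model_request *rq)
{
	rq->timeout *= MODEL_FLUSH_TIMEOUT_MULTIPLIER;
}

/* Patched behaviour: derives the flush timeout from the queue default. */
static void prep_flush_new(struct model_request *rq)
{
	rq->timeout = rq->q->rq_timeout * MODEL_FLUSH_TIMEOUT_MULTIPLIER;
}

/* Models the fallback: only fill in the default if prep left it at 0. */
static void model_add_timer(struct model_request *rq)
{
	if (!rq->timeout)
		rq->timeout = rq->q->rq_timeout;
}

/* Runs prep, then the timer fallback, and returns the resulting timeout. */
static unsigned int effective_timeout(void (*prep)(struct model_request *),
				      unsigned int queue_timeout)
{
	struct model_queue q = { .rq_timeout = queue_timeout };
	struct model_request rq = { .q = &q, .timeout = 0 };

	assert(prep);
	prep(&rq);
	model_add_timer(&rq);
	return rq.timeout;
}
```

Under these assumptions, effective_timeout(prep_flush_old,
MODEL_DEFAULT_TIMEOUT) yields 30, because 0 times the multiplier stays 0 and
the fallback then installs the plain queue default, while
effective_timeout(prep_flush_new, MODEL_DEFAULT_TIMEOUT) yields 60.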