Re: [PATCH 6/6] blk-mq: remove REQ_ATOM_STARTED

From: Tejun Heo
Date: Tue Dec 12 2017 - 12:01:24 EST


Hello, Jianchao.

On Tue, Dec 12, 2017 at 06:09:32PM +0800, jianchao.wang wrote:
> > @@ -786,18 +779,6 @@ static void blk_mq_rq_timed_out(struct request *req, bool reserved)
> >  	const struct blk_mq_ops *ops = req->q->mq_ops;
> >  	enum blk_eh_timer_return ret = BLK_EH_RESET_TIMER;
> >
> > -	/*
> > -	 * We know that complete is set at this point. If STARTED isn't set
> > -	 * anymore, then the request isn't active and the "timeout" should
> > -	 * just be ignored. This can happen due to the bitflag ordering.
> > -	 * Timeout first checks if STARTED is set, and if it is, assumes
> > -	 * the request is active. But if we race with completion, then
> > -	 * both flags will get cleared. So check here again, and ignore
> > -	 * a timeout event with a request that isn't active.
> > -	 */
> > -	if (!test_bit(REQ_ATOM_STARTED, &req->atomic_flags))
> > -		return;
> > -
> >  	if (ops->timeout)
> >  		ret = ops->timeout(req, reserved);
>
> The BLK_EH_RESET_TIMER case has not been covered here. In that case,
> the timer will be re-armed, but the gstate and aborted_gstate are
> not updated and still equal to each other. Consequently, when the
> request is completed later, the __blk_mq_complete_request() will be
> missed, then the request will expire again. The aborted_gstate
> should be updated in the BLK_EH_RESET_TIMER case.

You're right. This is inherently racy tho. Nothing prevented the
command from completing before complete was cleared. I'll just clear
aborted_gstate which should behave the same way.
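
To illustrate the problem and the intended fix, here is a toy user-space
model, not the actual patch. All toy_* names are made up for illustration,
and it assumes an in-flight gstate is never 0, so that 0 can stand for
"nothing claimed by the timeout path".

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model only; field names mirror the ones discussed in this thread. */
struct toy_request {
	uint64_t gstate;		/* generation + state, assumed non-zero in flight */
	uint64_t aborted_gstate;	/* gstate value claimed by the timeout path */
	bool completed;
};

/* Timeout path claims the request by recording its current gstate. */
static void toy_mark_aborted(struct toy_request *rq)
{
	rq->aborted_gstate = rq->gstate;
}

/*
 * Completion is honoured only while the timeout path has not claimed
 * the current gstate; otherwise it is silently dropped.
 */
static bool toy_complete(struct toy_request *rq)
{
	if (rq->aborted_gstate == rq->gstate)
		return false;
	rq->completed = true;
	return true;
}

/*
 * Hypothetical handling of a "reset timer" result. Without clearing
 * aborted_gstate, the two values stay equal and a later completion is
 * dropped. Re-arming the timer itself is omitted here.
 */
static void toy_reset_timer(struct toy_request *rq, bool clear_aborted)
{
	if (clear_aborted)
		rq->aborted_gstate = 0;
}
```

With clear_aborted == false, a toy_complete() after the reset returns
false and the request would time out again, which is the case you
describe. With clear_aborted == true, the later completion goes through.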

Thanks.

--
tejun