Re: [RFC PATCH] blk-mq: fixup RESTART when queue becomes idle
From: Bart Van Assche
Date: Thu Jan 18 2018 - 12:21:09 EST
On Thu, 2018-01-18 at 12:03 -0500, Mike Snitzer wrote:
> On Thu, Jan 18 2018 at 11:50am -0500,
> Bart Van Assche <bart.vanassche@xxxxxxx> wrote:
> > My comments about the above are as follows:
> > - It can take up to q->rq_timeout jiffies after a .queue_rq()
> > implementation returned BLK_STS_RESOURCE before blk_mq_timeout_work()
> > gets called. However, it can happen that the condition that caused
> > .queue_rq() to return BLK_STS_RESOURCE gets cleared only a few
> > milliseconds later. So the above approach can result in long delays
> > during which it will seem like the queue got stuck. Additionally, I
> > think that the block driver, and not the block layer core, should
> > decide how long it takes before a queue is rerun.
>
> So configure q->rq_timeout to be shorter? It is configurable through
> blk_mq_tag_set's 'timeout' member and apparently defaults to 30 * HZ.
>
> That is the problem with timeouts: there is generally no
> one-size-fits-all value.
Sorry but I think that would be wrong. The delay after which a queue is rerun
should not be coupled to the request timeout. These two should be independent.
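To be clear about what that knob does, here is a minimal sketch of a blk-mq
driver shortening the per-tag-set request timeout. The my_tag_set and
my_driver_init_tag_set names and the 5 * HZ value are made-up placeholders,
not code from any existing driver:

#include <linux/blk-mq.h>
#include <linux/jiffies.h>
#include <linux/numa.h>

static struct blk_mq_tag_set my_tag_set;

static int my_driver_init_tag_set(struct blk_mq_ops *ops)
{
        my_tag_set.ops = ops;
        my_tag_set.nr_hw_queues = 1;
        my_tag_set.queue_depth = 64;
        my_tag_set.numa_node = NUMA_NO_NODE;
        my_tag_set.flags = BLK_MQ_F_SHOULD_MERGE;
        /* Per-request timeout in jiffies; leaving this zero yields 30 * HZ. */
        my_tag_set.timeout = 5 * HZ;

        return blk_mq_alloc_tag_set(&my_tag_set);
}

That value controls how long a started request may remain outstanding before
the timeout handler runs; reusing it as the delay before an idle queue is
rerun is exactly the coupling I object to above.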
> > - The lockup that I reported only occurs with the dm driver but not any
> > other block driver. So why modify the block layer core since this can
> > be fixed by modifying the dm driver?
>
> Hard to know whether it is only DM's blk-mq that is impacted. That is
> the only blk-mq driver that you're testing like this (and that is also
> able to handle faults, etc).
That's not correct. I'm also testing the SCSI core, which is one of the most
complicated block drivers.
> > - A much simpler fix, and one that is known to work, exists: inserting
> > a blk_mq_delay_run_hw_queue() call in the dm driver.
>
> Because your "much simpler" fix actively hurts performance, as is
> detailed in this header:
> https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=dm-4.16&id=ec3eaf9a673106f66606896aed6ddd20180b02ec
We are close to the start of the merge window, so I think it's better to fall
back to an old approach that is known to work than to keep a new approach that
is known not to work. Additionally, the performance issue you referred to
affects IOPS and bandwidth by more than 1% only with the lpfc driver, and that
is because the queue depth it supports is much lower than that of other SCSI
HBAs, namely 3 instead of 64.
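For reference, the old approach I am referring to boils down to the sketch
below. The my_queue_rq and my_device_busy names and the 100 ms delay are
placeholders for illustration, not the actual dm-rq code:

#include <linux/blk-mq.h>
#include <linux/blkdev.h>

/* Placeholder for a driver-specific "out of resources" check. */
static bool my_device_busy(struct request_queue *q)
{
        return false;
}

static blk_status_t my_queue_rq(struct blk_mq_hw_ctx *hctx,
                                const struct blk_mq_queue_data *bd)
{
        struct request *rq = bd->rq;

        if (my_device_busy(rq->q)) {
                /*
                 * Rerun this hardware queue after a fixed delay instead of
                 * relying on the RESTART mechanism, so the queue cannot get
                 * stuck if no completion ever arrives to trigger a rerun.
                 */
                blk_mq_delay_run_hw_queue(hctx, 100 /* ms */);
                return BLK_STS_RESOURCE;
        }

        blk_mq_start_request(rq);
        /* ... dispatch rq to the underlying device here ... */
        return BLK_STS_OK;
}

With this approach the driver itself decides how quickly the queue is rerun,
which is the behavior that is known not to lock up.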
Thanks,
Bart.