Re: [PATCH -next V5] blk-mq: fix tag_get wait task can't be awakened

From: Guenter Roeck
Date: Thu Jan 27 2022 - 13:05:00 EST


On 1/27/22 09:28, Jens Axboe wrote:
On 1/26/22 6:32 PM, Guenter Roeck wrote:
Hi,

On Thu, Jan 13, 2022 at 10:55:36AM +0800, Laibin Qiu wrote:
In case of shared tags, there might be more than one hctx which
allocates from the same tags, and each hctx is limited to allocate at
most:
hctx_max_depth = max((bt->sb.depth + users - 1) / users, 4U);

Tag idle detection is lazy and may be delayed for 30 seconds, so there
could be just one really active hctx (queue) while all the others are
idle but still accounted as active because of the lazy idle detection.
If wake_batch is then greater than hctx_max_depth, driver tag allocation
may wait forever on this one active hctx.

Fix this by recalculating wake_batch whenever active_queues is
incremented or decremented.
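
The arithmetic behind the description above can be sketched in user space
as follows. This is only an illustration, not the kernel code: the cap of 8
on the wake batch and the helper names recalc_wake_batch() and may_stall()
are assumptions made for this sketch. Only the hctx_max_depth formula is
taken from the patch description.

```c
#include <assert.h>

/* Assumed upper bound on the wake batch; hypothetical value for this sketch. */
#define SKETCH_WAKE_BATCH_MAX 8U

static unsigned int max_u(unsigned int a, unsigned int b)
{
	return a > b ? a : b;
}

static unsigned int min_u(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/* Per-hctx allocation limit, as given in the patch description. */
static unsigned int hctx_max_depth(unsigned int depth, unsigned int users)
{
	assert(users > 0);
	return max_u((depth + users - 1) / users, 4U);
}

/*
 * Hypothetical recalculation: never let the wake batch exceed the number
 * of tags a single hctx is allowed to hold.
 */
static unsigned int recalc_wake_batch(unsigned int depth, unsigned int users)
{
	return min_u(hctx_max_depth(depth, users), SKETCH_WAKE_BATCH_MAX);
}

/*
 * Nonzero when a waiter could stall: a single active hctx can never
 * complete enough requests to fill one wake batch.
 */
static int may_stall(unsigned int wake_batch, unsigned int depth,
		     unsigned int users)
{
	return wake_batch > hctx_max_depth(depth, users);
}
```

With a shared depth of 32 and 8 queues still accounted as active,
hctx_max_depth(32, 8) is 4, so a fixed wake batch of 8 satisfies
may_stall(), while recalc_wake_batch(32, 8) yields 4 and does not.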

Fixes: 0d2602ca30e41 ("blk-mq: improve support for shared tags maps")
Suggested-by: Ming Lei <ming.lei@xxxxxxxxxx>
Suggested-by: John Garry <john.garry@xxxxxxxxxx>
Signed-off-by: Laibin Qiu <qiulaibin@xxxxxxxxxx>

I understand this problem has been reported already, but still:

This patch causes a hang in several of my QEMU emulations when
trying to boot from USB. Reverting it fixes the problem. Bisect log
is attached.

Boot logs are available at
https://kerneltests.org/builders/qemu-arm-aspeed-master/builds/230/steps/qemubuildcommand/logs/stdio
but don't really show much: the affected tests simply hang until they
are aborted.

This one got reported a few days ago, can you check if applying:

https://git.kernel.dk/cgit/linux-block/commit/?h=block-5.17&id=10825410b956dc1ed8c5fbc8bbedaffdadde7f20

fixes it for you?

Yes, it does.

Thanks,
Guenter