Re: [PATCH 3/3] blk-mq: Fix the queue freezing mechanism

From: Bart Van Assche
Date: Thu Sep 24 2015 - 13:36:04 EST


On 09/24/2015 09:54 AM, Tejun Heo wrote:
> On Thu, Sep 24, 2015 at 09:43:48AM -0700, Bart Van Assche wrote:
>> On 09/23/2015 08:23 PM, Ming Lei wrote:
>>> IMO, mq_freeze_depth should only be accessed in the slow path, and it
>>> looks like the race only happens during the small window between
>>> increasing 'mq_freeze_depth' and killing the percpu counter.
>>
>> Hello Ming,
>>
>> My concern is that *not* checking mq_freeze_depth in the hot path can cause
>> a livelock. If there is a software layer, e.g. multipathd, that periodically
>> submits new commands and if these commands take time to process, e.g. because
>> the transport layer is unavailable, how can we guarantee that freezing ever
>> succeeds without checking mq_freeze_depth in the hot path?
>
> I couldn't tell what the patch was trying to do from the patch
> description, so including the above probably is a good idea. Isn't the
> above guaranteed by percpu_ref_kill() preventing new tryget_live()'s?

My interpretation of the percpu_ref_tryget_live() implementation in <linux/percpu-refcount.h> is that the tryget operation will only fail if the refcount is in atomic mode and additionally the __PERCPU_REF_DEAD flag has been set.
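
To make that reading concrete, here is a minimal user-space sketch of the behaviour described above. It is not the kernel implementation: struct ref_model, its fields and model_tryget_live() are hypothetical names, a single counter stands in for the per-CPU counters, and RCU and atomicity are left out entirely.

```c
#include <stdbool.h>

/* Hypothetical user-space model; names are illustrative, not kernel API. */
struct ref_model {
	bool percpu_mode;	/* true while the ref is in percpu mode */
	bool dead;		/* models the __PERCPU_REF_DEAD flag */
	long count;		/* one counter standing in for all counters */
};

/*
 * Models the reading given above: the tryget fails only when the ref is
 * in atomic mode AND the dead flag has been set.
 */
static bool model_tryget_live(struct ref_model *ref)
{
	if (!ref->percpu_mode && ref->dead)
		return false;

	ref->count++;
	return true;
}
```

Under this model a caller keeps obtaining references until both conditions hold, which is why the moment at which the dead flag becomes visible matters for the freeze path.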

> Also, what do the barriers do in your patch?

My intention was to guarantee that, on architectures that do not provide the same ordering guarantees as x86 (e.g. PPC or ARM), the store and load operations on mq_freeze_depth and mq_usage_counter would not be reordered. However, it is probably safe to leave out the barrier I proposed to introduce in blk_mq_queue_enter(), since some delay in propagating mq_freeze_depth updates from the CPU that modified that counter to the CPU that reads it is acceptable.

> The only race condition that I can see there is if unfreeze and freeze
> race each other and freeze tries to kill the ref which hasn't finished
> reinit yet. We probably want to put mutexes around freeze/unfreeze so
> that they're serialized if something like that can happen (it isn't a
> hot path to begin with).

My concern is that the following could happen if mq_freeze_depth is not checked in the hot path of blk_mq_queue_enter():
* mq_usage_counter >= 1 before blk_mq_freeze_queue() is called.
* blk_mq_freeze_queue() keeps waiting forever if new requests are queued
  faster than they complete.
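
The scenario above can be illustrated with a small deterministic toy model. This is only a sketch of the concern, not of the block layer: ticks_until_drained() and all of its parameters are hypothetical, time advances in discrete ticks, and the model assumes that nothing other than the freeze-depth check stops new requests from entering. Whether percpu_ref_kill() already provides that guarantee is exactly the question raised in this thread.

```c
#include <stdbool.h>

/*
 * Toy model: a freeze has already started and waits for in_flight to
 * reach zero. Returns the tick at which the counter drains, or -1 if it
 * has not drained after max_ticks.
 */
static int ticks_until_drained(long in_flight, long submit_per_tick,
			       long complete_per_tick,
			       bool check_freeze_depth, int max_ticks)
{
	const int freeze_depth = 1;	/* the freeze is in progress */

	for (int tick = 1; tick <= max_ticks; tick++) {
		/* Submission side: enter unless the hot path sees the freeze. */
		if (!(check_freeze_depth && freeze_depth > 0))
			in_flight += submit_per_tick;

		/* Completion side: finish up to complete_per_tick requests. */
		if (complete_per_tick < in_flight)
			in_flight -= complete_per_tick;
		else
			in_flight = 0;

		if (in_flight == 0)
			return tick;
	}
	return -1;
}
```

For example, with four requests in flight, two submitted and one completed per tick, the model never drains without the check, while with the check it drains after four ticks.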

Bart.
