Re: [PATCH v4 12/13] blk-mq: Use shared tags for shared sbitmap support

From: Hannes Reinecke
Date: Fri Sep 24 2021 - 06:23:58 EST


On 9/24/21 10:28 AM, John Garry wrote:
Currently we use separate sbitmap pairs and active_queues atomic_t for
shared sbitmap support.

However, a full set of static requests is allocated per HW queue, which is
quite wasteful, considering that the total number of requests usable at
any given time across all HW queues is limited by the shared sbitmap depth.

As such, it is considerably more memory efficient in the case of shared
sbitmap to allocate a set of static rqs per tag set or request queue, and
not per HW queue.
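
As a rough illustration (the numbers here are assumed for the sake of the
example, not taken from the patch): with 16 HW queues and a shared sbitmap
depth of 1024,

  per-HW-queue static rqs:  16 x 1024 = 16384 requests allocated
  usable at any one time:   at most 1024 (bounded by the shared depth)
  shared static rqs:         1 x 1024 =  1024 requests allocated

so the static allocation shrinks by roughly a factor of nr_hw_queues.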

So replace the sbitmap pairs and active_queues atomic_t with a shared
tags structure per tagset and per request queue, which will hold the set
of shared static rqs.
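
Conceptually the direction is something like the following sketch (the
field names are illustrative only and need not match the actual diff):

  struct blk_mq_tag_set {
          ...
          struct blk_mq_tags *shared_tags;       /* tags + static rqs, shared by all hctxs */
  };

  struct request_queue {
          ...
          struct blk_mq_tags *sched_shared_tags; /* same idea for scheduler tags */
  };

with each hctx presumably pointing at the shared tags rather than owning
its own set.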

Since there is now no valid HW queue index to be passed to the blk_mq_ops
.init_request and .exit_request callbacks, pass an invalid index token.
This changes the semantics of the APIs, such that the callback would need
to validate the HW queue index before using it. Currently no user of
shared sbitmap actually uses the HW queue index (as would be expected).
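
For a driver that did want the index, the new contract would look roughly
like the sketch below; BLK_MQ_NO_HW_CTX here stands in for whatever
invalid-index token the patch actually passes:

  static int foo_init_request(struct blk_mq_tag_set *set, struct request *rq,
                              unsigned int hctx_idx, unsigned int numa_node)
  {
          /*
           * With shared static rqs the request is no longer tied to one
           * HW queue, so hctx_idx may hold the invalid token and must be
           * validated before being used, e.g. as an array index.
           */
          if (hctx_idx == BLK_MQ_NO_HW_CTX)
                  return 0;       /* no per-hctx setup to do */

          /* per-hctx initialisation would go here */
          return 0;
  }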

Continue to use the term "shared sbitmap" for now, as its meaning is well
known.

Signed-off-by: John Garry <john.garry@xxxxxxxxxx>
---
block/blk-mq-sched.c | 82 ++++++++++++++++++-------------------
block/blk-mq-tag.c | 61 ++++++++++------------------
block/blk-mq-tag.h | 6 +--
block/blk-mq.c | 91 +++++++++++++++++++++++-------------------
block/blk-mq.h | 5 ++-
include/linux/blk-mq.h | 15 ++++---
include/linux/blkdev.h | 3 +-
7 files changed, 125 insertions(+), 138 deletions(-)

The overall idea behind keeping the full request allocation per HW queue
was to ensure memory locality for the requests themselves.
When moving to a shared request set we obviously lose that property.

But I'm not sure whether that matters here; the performance impact might
be too small to be measurable, seeing that we'll most likely be bound by
hardware latencies anyway.

Nevertheless: have you tested this patchset for performance regressions?
I'm thinking especially of Kashyap's high-IOPS megaraid setup; if there is
a performance impact, that is the most likely scenario in which we'd be
able to measure it.

But even if there is a performance impact, this patchset might still be
worthwhile, seeing that it massively reduces the memory footprint.

Cheers,

Hannes
--
Dr. Hannes Reinecke                  Kernel Storage Architect
hare@xxxxxxx                         +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer