Re: [PATCH v4 12/13] blk-mq: Use shared tags for shared sbitmap support

From: John Garry
Date: Fri Sep 24 2021 - 06:36:22 EST

Next message: Frederic Weisbecker: "Re: [Issue] timer callback registered with mod_timer is getting called beforetime"
Previous message: Pavel Machek: "Re: [patch 136/147] nilfs2: use refcount_dec_and_lock() to fix potential UAF"
In reply to: Hannes Reinecke: "Re: [PATCH v4 12/13] blk-mq: Use shared tags for shared sbitmap support"
Next in thread: John Garry: "Re: [PATCH v4 12/13] blk-mq: Use shared tags for shared sbitmap support"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

+ Kashyap

On 24/09/2021 11:23, Hannes Reinecke wrote:

On 9/24/21 10:28 AM, John Garry wrote:

Currently we use separate sbitmap pairs and active_queues atomic_t for
shared sbitmap support.

However a full sets of static requests are used per HW queue, which is
quite wasteful, considering that the total number of requests usable at
any given time across all HW queues is limited by the shared sbitmap depth.

As such, it is considerably more memory efficient in the case of shared
sbitmap to allocate a set of static rqs per tag set or request queue, and
not per HW queue.

So replace the sbitmap pairs and active_queues atomic_t with a shared
tags per tagset and request queue, which will hold a set of shared static
rqs.

Since there is now no valid HW queue index to be passed to the blk_mq_ops
.init and .exit_request callbacks, pass an invalid index token. This
changes the semantics of the APIs, such that the callback would need to
validate the HW queue index before using it. Currently no user of shared
sbitmap actually uses the HW queue index (as would be expected).

Continue to use term "shared sbitmap" for now, as the meaning is known.

Signed-off-by: John Garry <john.garry@xxxxxxxxxx>
---
block/blk-mq-sched.c   | 82 ++++++++++++++++++-------------------
block/blk-mq-tag.c     | 61 ++++++++++------------------
block/blk-mq-tag.h     | 6 +--
block/blk-mq.c         | 91 +++++++++++++++++++++++-------------------
block/blk-mq.h         | 5 ++-
include/linux/blk-mq.h | 15 ++++---
include/linux/blkdev.h | 3 +-
7 files changed, 125 insertions(+), 138 deletions(-)

The overall idea to keep the full request allocation per queue was to ensure memory locality for the requests themselves.
When moving to a shared request structure we obviously loose that feature.

But I'm not sure if that matters here; the performance impact might be too small to be measurable, seeing that we'll be most likely bound by hardware latencies anyway.

Nevertheless: have you tested for performance regressions with this patchset?

I have tested relatively lower rates, like ~450K IOPS, without any noticeable regression.

I'm especially thinking of Kashyaps high-IOPS megaraid setup; if there is a performance impact that'll be likely scenario where we can measure it.

I can test higher rates, like 2M IOPS, when I get access to the HW.

@Kashyap, Any chance you can help test performance here?

But even if there is a performance impact this patchset might be worthwhile, seeing that it'll reduce the memory footprint massively.

Sure, I don't think that minor performance improvements can justify the excessive memory.

Thanks,
John

Next message: Frederic Weisbecker: "Re: [Issue] timer callback registered with mod_timer is getting called beforetime"
Previous message: Pavel Machek: "Re: [patch 136/147] nilfs2: use refcount_dec_and_lock() to fix potential UAF"
In reply to: Hannes Reinecke: "Re: [PATCH v4 12/13] blk-mq: Use shared tags for shared sbitmap support"
Next in thread: John Garry: "Re: [PATCH v4 12/13] blk-mq: Use shared tags for shared sbitmap support"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]