Re: [PATCH RFC 3/5] blk-mq: Facilitate a shared tags per tagset

From: John Garry
Date: Fri Nov 15 2019 - 05:46:59 EST


On 15/11/2019 07:26, Hannes Reinecke wrote:
On 11/14/19 10:41 AM, John Garry wrote:
On 13/11/2019 18:38, Hannes Reinecke wrote:
Hi Hannes,

Oh, my. Indeed, that's correct.

The tags could be kept in sync like this:

shared_tag = blk_mq_get_tag(shared_tagset);
if (shared_tag != -1)
ÂÂÂÂÂsbitmap_set(hctx->tags, shared_tag);

But that's obviously not ideal.

Actually, I _do_ prefer keeping both in sync.
We might want to check if the 'normal' tag is set (typically it would
not, but then, who knows ...)
The beauty here is that both 'shared' and 'normal' tag are in sync, so
if a driver would be wanting to use the tag as index into a command
array it can do so without any surprises.

Why do you think it's not ideal?

A few points:
- Getting a bit from one tagset and then setting it in another tagset is
a bit clunky.
Yes, that's true.
But painstakingly trying to find a free bit in a bitmask when we already
know which to pick is also a bit daft.

Yeah, but it's not all good - there would still be a certain overhead in the atomic set and unset bit on the hctx sbitmap. However it still should be faster as it excludes the search.


- There may be an atomicity of the getting the shared tag bit and
setting the hctx tag bit - I don't think that there is.

That was precisely what I've alluded to in 'We might want to check if
the normal tag is set'.
Typically the 'normal' tag would be free (as the shared tag set out of
necessity needs to be the combination of all hctx tag sets).

Right

Any difference here _is_ a programming error, and should be flagged as
such (sbitmap_test_and_set() anyone?)

Eh, I hope that we wouldn't need this.

We might have ordering issues on release, as we really should drop the
hctx tag before the shared tag; but when we observe that we should be fine.

- Consider that sometimes we may want to check if there is space on a hw
queue - checking the hctx tags is not really proper any longer, as
typically there would always be space on hctx, but not always the shared
tags. We did delete blk_mq_can_queue() yesterday, which would be an
example of that. Need to check if there are others.

Clearly, this needs an audit of all functions accessing the hctx tag
space; maybe it's worth having a pre-requisite patchset differentiating
between hctx tags and global, shared tags. Hmm.

Having said all that, the obvious advantage is performance gain, can
still use request.tag and so maybe less intrusive changes.

I'll have a look at the implementation. The devil is mostly in the
detail...

True.
And, incidentally, if we run with shared tage we can skip the scheduling
section in blk_mq_get_tag(); if we're out of tags, we're out of tags,

Right, but don't we need to then "kick all hw queues", instead of just that for the current hctx in blk_mq_get_tag() when free tags are exhausted?

and no rescheduling will help as we don't _have_ other tagsets to look at.

But overall I like this approach.


Yeah, to me it seems sensible. Again, a neat implementation is the challenge.

I'll post an RFC v2 for this one.

I am also requesting some performance figures also internally.

Thanks,
John