Re: [PATCH 0/7] hisi_sas: Misc bugfixes and an optimisation patch

From: Ming Lei
Date: Thu Oct 11 2018 - 09:32:17 EST


On Thu, Oct 11, 2018 at 02:12:11PM +0100, John Garry wrote:
> On 11/10/2018 11:15, Christoph Hellwig wrote:
> > On Thu, Oct 11, 2018 at 10:59:11AM +0100, John Garry wrote:
> > >
> > > > blk-mq tags are always per-host (which has actually caused problems for
> > > > ATA, which is now using its own per-device tags).
> > > >
> > >
> > > So, for example, if Scsi_host.can_queue = 2048 and Scsi_host.nr_hw_queues =
> > > 16, then rq tags are still in range [0, 2048) for that HBA, i.e. invariant
> > > on queue count?
> >
> > Yes, if can_queue is 2048 you will get tags from 0..2047.
> >
>
> I should be clear about some things before discussing this further. Our
> device has 16 hw queues. And each command we send to any queue in the device
> must have a unique tag across all hw queues for that device, and should be
> in the range [0, 2048) - it's called an IPTT. So Scsi_host.can_queue = 2048.

Could you describe IPTT in a bit more detail?

It looks like the 16 hw queues are similar to the reply queues in other
drivers, such as megaraid_sas. But since all 16 reply queues share one
tagset, the hw queue number has to be 1 from blk-mq's view.

>
> However today we only expose a single queue to upper layer (for unrelated
> LLDD error handling restriction). We hope to expose all 16 queues in future,
> which is what I meant by "enabling SCSI MQ in the driver". However, with
> 6/7, this creates a problem, below.

If the tag of each request has to be unique across all hw queues, you
can't expose all 16 queues.

>
> > IFF your device needs different tags for different queues it can use
> > the blk_mq_unique_tag helper to generate a unique global tag.
>
> So this helper can't help, as fundamentally the issue is "the tag field in
> struct request is unique per hardware queue but not across all hw queues".
> Indeed blk_mq_unique_tag() does give a unique global tag, but cannot be used
> for the IPTT.
>
> OTOH, we could expose 16 queues to the upper layer, and drop 6/7, but we
> found it performs worse.

We discussed this issue before, but haven't found a good solution yet for
exposing multiple hw queues to blk-mq.

However, we can still get good performance with the 'none' I/O scheduler
thanks to the following patches:

8824f62246be blk-mq: fail the request in case issue failure
6ce3dd6eec11 blk-mq: issue directly if hw queue isn't busy in case of 'none'


Thanks,
Ming