In theory, you could still generate and manage the IPTT in the LLDD by
simply ignoring rq->tag, while enabling SCSI_MQ with 16 hw queues.
However, I'm not sure how much that approach would improve performance,
and it may even degrade IO perf: if 16 hw queues are exposed to blk-mq,
up to 16 * .can_queue requests may be queued to the driver, and allocation
& free on the single IPTT pool will become a bottleneck.
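For illustration, here is a minimal sketch of that kind of LLDD-managed
IPTT pool (all names are hypothetical, not taken from any real driver):
a single bitmap under one spinlock, which is exactly the allocation/free
hot spot once 16 hw queues submit concurrently.

#include <linux/bitmap.h>
#include <linux/errno.h>
#include <linux/spinlock.h>

#define MY_MAX_IPTT	4096		/* hypothetical device IPTT space */

struct my_hba {
	spinlock_t	iptt_lock;
	unsigned long	iptt_bitmap[BITS_TO_LONGS(MY_MAX_IPTT)];
};

/* allocate an IPTT independently of rq->tag */
static int my_iptt_alloc(struct my_hba *hba)
{
	unsigned long flags;
	int iptt;

	spin_lock_irqsave(&hba->iptt_lock, flags);
	iptt = find_first_zero_bit(hba->iptt_bitmap, MY_MAX_IPTT);
	if (iptt >= MY_MAX_IPTT) {
		spin_unlock_irqrestore(&hba->iptt_lock, flags);
		return -ENOSPC;
	}
	set_bit(iptt, hba->iptt_bitmap);
	spin_unlock_irqrestore(&hba->iptt_lock, flags);

	return iptt;
}

static void my_iptt_free(struct my_hba *hba, int iptt)
{
	unsigned long flags;

	spin_lock_irqsave(&hba->iptt_lock, flags);
	clear_bit(iptt, hba->iptt_bitmap);
	spin_unlock_irqrestore(&hba->iptt_lock, flags);
}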
From my experiments with a host-wide tagset, it might be a good tradeoff
to allocate one hw queue per NUMA node to avoid remote access on the
dispatch data/request structures in this case, but your IPTT pool is still
shared across all CPUs, so maybe you can try the smart sbitmap:
https://www.spinics.net/lists/linux-scsi/msg117920.html
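As a rough sketch of what moving that shared pool onto a sbitmap_queue
could look like (stock <linux/sbitmap.h> API only; the smart sbitmap
patchset linked above may go further, and the names are again
hypothetical):

#include <linux/gfp.h>
#include <linux/sbitmap.h>
#include <linux/smp.h>

#define MY_MAX_IPTT	4096	/* same hypothetical IPTT space as above */

/* allocate the pool's memory on the given NUMA node */
static int my_iptt_pool_init(struct sbitmap_queue *sbq, int node)
{
	/* shift = -1 lets sbitmap pick a sensible word size */
	return sbitmap_queue_init_node(sbq, MY_MAX_IPTT, -1, false,
				       GFP_KERNEL, node);
}

static int my_iptt_alloc(struct sbitmap_queue *sbq)
{
	return __sbitmap_queue_get(sbq);	/* returns -1 when exhausted */
}

static void my_iptt_free(struct sbitmap_queue *sbq, int iptt)
{
	sbitmap_queue_clear(sbq, iptt, raw_smp_processor_id());
}

/* pair my_iptt_pool_init() with sbitmap_queue_free() on teardown */

The per-CPU allocation hints in sbitmap spread the bit scans across cache
lines, so even a single shared pool contends less than one bitmap behind
one spinlock.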
IFF your device needs different tags for different queues it can use
the blk_mq_unique_tag helper to generate a unique global tag.
So this helper can't help, as fundamentally the issue is that "the tag field
in struct request is unique per hardware queue but not across all hw queues".
Indeed blk_mq_unique_tag() does give a unique global tag, but it cannot be
used for the IPTT.
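For reference, here is roughly what the helper hands an LLDD (using
scmd->request as in kernels of that time): the value is globally unique,
but the hw queue index sits in the high 16 bits, so the tag space is
sparse and can't be used to index a dense IPTT table sized to .can_queue.

#include <linux/blk-mq.h>
#include <linux/printk.h>
#include <scsi/scsi_cmnd.h>

static void show_unique_tag(struct scsi_cmnd *scmd)
{
	/* unique = (hwq << BLK_MQ_UNIQUE_TAG_BITS) | per-queue rq->tag */
	u32 unique = blk_mq_unique_tag(scmd->request);

	pr_debug("hwq=%u per-queue tag=%u\n",
		 blk_mq_unique_tag_to_hwq(unique),
		 blk_mq_unique_tag_to_tag(unique));
}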
OTOH, we could expose 16 queues to the upper layer and drop 6/7, but we
found it performs worse.
We have discussed this issue before, but haven't found a good solution yet
for exposing multiple hw queues to blk-mq.
I just think it's unfortunate that enabling blk-mq means the LLDD loses
this unique tag across all queues in the range [0, Scsi_Host.can_queue),
so much so that we found performance better by not exposing multiple queues
and continuing to use the single rq tag...
It isn't a new problem; we discussed it a lot for megaraid_sas, which has
the same situation as yours, and you can find the threads in the linux-block
list archive. Kashyap Desai did lots of testing on this case.
However, we still get good performance with the 'none' scheduler thanks to
the following patches:
8824f62246be blk-mq: fail the request in case issue failure
6ce3dd6eec11 blk-mq: issue directly if hw queue isn't busy in case of 'none'
I think that these patches would have been included in our testing. I need
to check.
Please switch to the 'none' io scheduler in your test
(echo none > /sys/block/<dev>/queue/scheduler); it was observed that IO
perf becomes good on megaraid_sas that way.
Thanks,
Ming