I mentioned in the thread "blk-mq: improvement on handling IO during CPU
hotplug" that I was using this series to test that patchset.
So just with this patchset (and without yours), I get what looks like some
IO errors in the LLDD. The error is an underflow error. I can't figure out
what the cause is.
Can you post the error log? Or interpret the 'underflow error' from the hisi
sas or scsi viewpoint?
I'm wondering if the SCSI command is getting corrupted somehow.
Why do you think the command is corrupted?
+ if (expose_mq_experimental) {
+ shost->can_queue = HISI_SAS_MAX_COMMANDS;
+ shost->cmd_per_lun = HISI_SAS_MAX_COMMANDS;
The above contradicts the current meaning of 'nr_hw_queues';
see the comment on Scsi_Host.nr_hw_queues.
Right, so I am generating the hostwide tag in the LLDD. And the Scsi
host-wide host_busy counter should ensure that we don't pump too much IO to
the HBA.
Even without the host-wide host_busy, your approach should work if you
build the hisi sas tag correctly (uniquely), just not efficiently.
I suggest you collect a trace and check whether each request is sent to
the hardware with the expected hisi sas tag.
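
To illustrate what building the tag uniquely means here, below is a minimal user-space sketch of a host-wide tag pool shared by all hardware queues. It is not the hisi_sas code: the names demo_tag_pool, demo_tag_alloc, demo_tag_free and the pool size DEMO_MAX_COMMANDS are all hypothetical, and the sketch is single-threaded, whereas a real LLDD would need locking or atomic bitmap operations around the pool.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical pool size; stands in for a driver limit such as
 * HISI_SAS_MAX_COMMANDS. Kept at 64 so one 64-bit word holds the bitmap. */
#define DEMO_MAX_COMMANDS 64
#define DEMO_NO_TAG (-1)

/* One pool per host, shared by every hardware queue, so a tag value can
 * never be outstanding on two queues at the same time. */
struct demo_tag_pool {
	uint64_t in_use;	/* bit n set => tag n is outstanding */
};

static void demo_tag_pool_init(struct demo_tag_pool *pool)
{
	pool->in_use = 0;
}

/* Return the lowest free tag, or DEMO_NO_TAG if the pool is exhausted. */
static int demo_tag_alloc(struct demo_tag_pool *pool)
{
	for (int tag = 0; tag < DEMO_MAX_COMMANDS; tag++) {
		uint64_t bit = (uint64_t)1 << tag;

		if (!(pool->in_use & bit)) {
			pool->in_use |= bit;
			return tag;
		}
	}
	return DEMO_NO_TAG;
}

/* Release a tag on command completion. Freeing a tag that is not
 * outstanding would indicate a driver bug, so it is asserted against. */
static void demo_tag_free(struct demo_tag_pool *pool, int tag)
{
	assert(tag >= 0 && tag < DEMO_MAX_COMMANDS);

	uint64_t bit = (uint64_t)1 << tag;

	assert(pool->in_use & bit);
	pool->in_use &= ~bit;
}
```

Since every submission path draws from the same pool, two requests in flight never carry the same tag regardless of which queue they were issued on. Comparing the tag handed out at allocation with the tag seen at completion in a trace is one way to check that property.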
BTW, the patch of 'scsi: core: avoid host-wide host_busy counter for scsi_mq'
will be merged to v5.5 if everything is fine.
https://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git/commit/?h=5.5/scsi-queue&id=6eb045e092efefafc6687409a6fa6d1dabf0fb69