On Mon, May 03, 2021 at 06:22:13PM +0800, John Garry wrote:
The tags used for an IO scheduler are currently per hctx.
As such, when q->nr_hw_queues grows, so does the request queue's total IO
scheduler tag depth.
This may cause problems for SCSI MQ HBAs whose total driver depth is
fixed.
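As a quick way to see this scaling (a sketch; "sdb" is a placeholder
device name, and debugfs must be mounted), the per-hctx scheduler and
driver tag depths can be read from blk-mq debugfs:

# one hctx<N> directory per hardware queue
ls -d /sys/kernel/debug/block/sdb/hctx* | wc -l
# scheduler tag depth of one hctx; total sched depth is this times the hctx count
grep nr_tags /sys/kernel/debug/block/sdb/hctx0/sched_tags
# driver tag depth, which is fixed by the HBA
grep nr_tags /sys/kernel/debug/block/sdb/hctx0/tags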
Ming and Yanhui report higher CPU usage and lower throughput in scenarios
where the fixed total driver tag depth is appreciably lower than the total
scheduler tag depth:
https://lore.kernel.org/linux-block/440dfcfc-1a2c-bd98-1161-cec4d78c6dfc@xxxxxxxxxx/T/#mc0d6d4f95275a2743d1c8c3e4dc9ff6c9aa3a76b
With this patch, fio running on scsi_debug on Yanhui's test machine no
longer shows any difference between:
modprobe scsi_debug host_max_queue=128 submit_queues=32 virtual_gb=256 delay=1
vs.
modprobe scsi_debug max_queue=128 submit_queues=1 virtual_gb=256 delay=1
Without this patch, the latter's throughput is 30% higher than the former's.
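For context, a back-of-the-envelope estimate (assuming the default
scheduler queue depth of twice the tag set depth, which may differ across
kernel versions):

  total sched tag depth  ~= 32 hctxs * (2 * 128) = 8192
  total driver tag depth  = 128

so without the patch the scheduler can hold far more requests than the
driver can ever dispatch.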
Note: scsi_debug's device queue depth needs to be updated to 128 to avoid
an IO hang, which is a separate SCSI issue.
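For reference, the device queue depth can be updated at runtime via sysfs
(a sketch; 0:0:0:0 is a placeholder for the actual scsi_debug device
address):

echo 128 > /sys/bus/scsi/devices/0:0:0:0/queue_depth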