Re: [PATCH] blk-mq: allow hardware queue to get more tag while sharing a tag set
From: yukuai (C)
Date: Mon Aug 02 2021 - 09:34:33 EST
On 2021/08/01 1:15, Bart Van Assche wrote:
On 7/11/21 8:18 PM, Yu Kuai wrote:
If there are multiple active queues while sharing a tag set, it is not
necessary to limit each active queue to an equal share of the available
tags as long as no queue has ever failed to get a driver tag. Fall back
to equal shares once some queue does fail to get a driver tag.
This modification will be beneficial if the total queue_depth of the disks
on the same host is less than the total number of tags.
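For illustration only, the policy described above can be sketched in a few lines of Python. This is not the patch itself: the function names, the `someone_failed` flag, the round-up division and the floor of 4 tags per queue are all assumptions made for this sketch.

```python
def fair_share_depth(total_tags: int, active_queues: int, min_depth: int = 4) -> int:
    """Per-queue tag limit when every active queue gets an equal share.

    Rounding up and the floor of `min_depth` are assumptions of this sketch.
    """
    if active_queues <= 1:
        return total_tags
    return max((total_tags + active_queues - 1) // active_queues, min_depth)


def may_get_tag(in_flight: int, total_tags: int, active_queues: int,
                someone_failed: bool) -> bool:
    """Decide whether a queue that already holds `in_flight` tags may ask for one more.

    `someone_failed` is a hypothetical flag meaning "some queue sharing this
    tag set has failed to get a driver tag".
    """
    if not someone_failed:
        # Nobody has been starved so far, so the per-queue limit is skipped.
        # The allocation itself is still bounded by the total number of tags.
        return True
    # Fall back to equal shares.
    return in_flight < fair_share_depth(total_tags, active_queues)
```

With 256 tags shared by 4 active queues, a queue holding 100 tags may keep allocating while nobody has failed, and is held to 64 tags once somebody has.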
This patch adds new atomic operations in the hot path and hence probably
has a negative performance impact. What is the performance impact of
this patch for e.g. null_blk when submitting I/O from all CPU cores?
Thanks,
Bart.
Hi, Bart
I ran a test on both null_blk and nvme; the results show no
performance degradation:
test platform: x86
test cpu: 2 nodes, total 72
test scheduler: none
test device: null_blk / nvme
test cmd: fio -filename=/dev/xxx -name=test -ioengine=libaio -direct=1
-numjobs=72 -iodepth=16 -bs=4k -rw=write -offset_increment=1G
-cpus_allowed=0:71 -cpus_allowed_policy=split -group_reporting
-runtime=120
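As a rough reading of the command above, the snippet below works out what each of the 72 jobs does. It assumes that -offset_increment=1G places job i at a start offset of i * 1 GiB, and that the CPU list together with -cpus_allowed_policy=split is meant to give each job its own CPU out of 0 to 71. The one-to-one job-to-CPU order and the helper name `job_layout` are assumptions of this sketch.

```python
NUMJOBS = 72
IODEPTH = 16
OFFSET_INCREMENT = 1 << 30  # 1G, taken as 1 GiB


def job_layout(job_index: int) -> dict:
    """Return the assumed CPU and start offset of one fio job."""
    if not 0 <= job_index < NUMJOBS:
        raise ValueError("job_index out of range")
    return {
        "cpu": job_index,  # assumed: job i is pinned to CPU i
        "start_offset": job_index * OFFSET_INCREMENT,
    }


# Upper bound on I/Os in flight across all jobs.
max_in_flight = NUMJOBS * IODEPTH

if __name__ == "__main__":
    print("max in flight:", max_in_flight)
    for i in (0, 1, NUMJOBS - 1):
        print(i, job_layout(i))
```

So up to 72 * 16 = 1152 writes can be outstanding at once, which is the load the shared tag set has to serve.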
test results: iops (last status-line sample; average over the whole run in
parentheses)
1) null_blk before this patch: 280k (279k)
2) null_blk after this patch: 282k (281k)
3) nvme before this patch: 378k (382k)
4) nvme after this patch: 384k (383k)
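To put the summary in relative terms, here is a small calculation over the four status-line figures (280k, 282k, 378k, 384k). The helper name `relative_change` is only for illustration.

```python
def relative_change(before: float, after: float) -> float:
    """Percentage change from `before` to `after`."""
    return (after - before) / before * 100.0


# IOPS in thousands: (before the patch, after the patch)
results_kiops = {
    "null_blk": (280, 282),
    "nvme": (378, 384),
}

if __name__ == "__main__":
    for device, (before, after) in results_kiops.items():
        print(f"{device}: {relative_change(before, after):+.2f}%")
    # null_blk: +0.71%
    # nvme: +1.59%
```

Both differences are small and positive. Each configuration was run once, so they are best read as "no measurable regression" rather than as a speedup.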
details:
1) null_blk before this patch:
test: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
...
fio-3.13-42-g8066f
Starting 72 processes
Jobs: 72 (f=72): [W(72)][100.0%][w=1095MiB/s][w=280k IOPS][eta 00m:00s]
test: (groupid=0, jobs=72): err= 0: pid=4986: Mon Aug 2 11:25:33 2021
write: IOPS=279k, BW=1091MiB/s (1144MB/s)(128GiB/120009msec); 0 zone resets
slat (nsec): min=1069, max=1837.6M, avg=240280.55, stdev=3604257.00
clat (usec): min=89, max=1837.9k, avg=3882.70, stdev=13528.70
lat (usec): min=175, max=1837.9k, avg=4123.03, stdev=13939.66
clat percentiles (usec):
| 1.00th=[ 223], 5.00th=[ 223], 10.00th=[ 225], 20.00th=[ 231],
| 30.00th=[ 478], 40.00th=[ 873], 50.00th=[ 1811], 60.00th=[ 2737],
| 70.00th=[ 4293], 80.00th=[ 6915], 90.00th=[ 9372], 95.00th=[ 12780],
| 99.00th=[ 18482], 99.50th=[ 22676], 99.90th=[ 62129], 99.95th=[231736],
| 99.99th=[641729]
bw ( MiB/s): min= 32, max= 3681, per=100.00%, avg=1106.55, stdev=25.25, samples=17006
iops : min= 8405, max=942588, avg=283276.25, stdev=6464.60, samples=17006
lat (usec) : 100=0.01%, 250=24.18%, 500=8.74%, 750=4.72%, 1000=4.01%
lat (msec) : 2=12.28%, 4=12.86%, 10=24.23%, 20=8.06%, 50=0.81%
lat (msec) : 100=0.02%, 250=0.04%, 500=0.03%, 750=0.01%, 1000=0.01%
lat (msec) : 2000=0.01%
cpu : usr=0.35%, sys=0.79%, ctx=35473919, majf=0, minf=1419
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=100.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=0,33525688,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=16
Run status group 0 (all jobs):
WRITE: bw=1091MiB/s (1144MB/s), 1091MiB/s-1091MiB/s (1144MB/s-1144MB/s), io=128GiB (137GB), run=120009-120009msec
Disk stats (read/write):
nullb0: ios=0/33485328, merge=0/0, ticks=0/4817009, in_queue=4817009, util=100.00%
2) null_blk after this patch:
test: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
...
fio-3.13-42-g8066f
Starting 72 processes
Jobs: 72 (f=72): [W(72)][100.0%][w=1101MiB/s][w=282k IOPS][eta 00m:00s]
test: (groupid=0, jobs=72): err= 0: pid=5001: Mon Aug 2 10:36:52 2021
write: IOPS=281k, BW=1097MiB/s (1150MB/s)(129GiB/120009msec); 0 zone resets
slat (nsec): min=1104, max=5358.9M, avg=239050.23, stdev=4040598.71
clat (usec): min=2, max=5359.3k, avg=3862.86, stdev=15270.20
lat (usec): min=4, max=5359.3k, avg=4101.96, stdev=15742.32
clat percentiles (usec):
| 1.00th=[ 221], 5.00th=[ 223], 10.00th=[ 225], 20.00th=[ 231],
| 30.00th=[ 482], 40.00th=[ 1106], 50.00th=[ 1909], 60.00th=[ 3163],
| 70.00th=[ 4490], 80.00th=[ 5538], 90.00th=[ 10683], 95.00th=[ 14877],
| 99.00th=[ 16450], 99.50th=[ 19530], 99.90th=[ 30802], 99.95th=[ 34341],
| 99.99th=[650118]
bw ( MiB/s): min= 23, max= 4395, per=100.00%, avg=1119.48, stdev=27.64, samples=16872
iops : min= 5906, max=1125367, avg=286585.88, stdev=7075.29, samples=16872
lat (usec) : 4=0.01%, 10=0.01%, 20=0.01%, 50=0.01%, 250=24.77%
lat (usec) : 500=6.12%, 750=4.51%, 1000=3.97%
lat (msec) : 2=11.02%, 4=15.75%, 10=23.34%, 20=10.15%, 50=0.34%
lat (msec) : 100=0.01%, 250=0.01%, 500=0.01%, 750=0.01%, 1000=0.01%
lat (msec) : 2000=0.01%, >=2000=0.01%
cpu : usr=0.36%, sys=0.79%, ctx=35506798, majf=0, minf=966
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=100.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=0,33697894,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=16
Run status group 0 (all jobs):
WRITE: bw=1097MiB/s (1150MB/s), 1097MiB/s-1097MiB/s (1150MB/s-1150MB/s), io=129GiB (138GB), run=120009-120009msec
Disk stats (read/write):
nullb0: ios=0/33657152, merge=0/0, ticks=0/4812746, in_queue=4812745, util=100.00%
3) nvme before this patch:
test: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
...
fio-3.13-42-g8066f
Starting 72 processes
Jobs: 72 (f=72): [W(72)][100.0%][w=1478MiB/s][w=378k IOPS][eta 00m:00s]
test: (groupid=0, jobs=72): err= 0: pid=4780: Mon Aug 2 11:22:36 2021
write: IOPS=382k, BW=1491MiB/s (1564MB/s)(175GiB/120113msec); 0 zone resets
slat (nsec): min=1234, max=328006k, avg=102467.85, stdev=4967629.26
clat (nsec): min=1788, max=329044k, avg=2899631.83, stdev=24819488.97
lat (usec): min=31, max=424004, avg=3004.41, stdev=25334.53
clat percentiles (usec):
| 1.00th=[ 39], 5.00th=[ 39], 10.00th=[ 39], 20.00th=[ 39],
| 30.00th=[ 40], 40.00th=[ 40], 50.00th=[ 40], 60.00th=[ 40],
| 70.00th=[ 41], 80.00th=[ 41], 90.00th=[ 42], 95.00th=[ 45],
| 99.00th=[132645], 99.50th=[252707], 99.90th=[287310], 99.95th=[291505],
| 99.99th=[304088]
bw ( MiB/s): min= 783, max= 2394, per=100.00%, avg=1492.49, stdev= 5.56, samples=17278
iops : min=200590, max=613014, avg=382076.48, stdev=1424.37, samples=17278
lat (usec) : 2=0.01%, 4=0.01%, 20=0.01%, 50=95.89%, 100=0.06%
lat (usec) : 250=0.06%, 500=0.15%, 750=0.18%, 1000=0.22%
lat (msec) : 2=0.96%, 4=0.60%, 10=0.21%, 20=0.05%, 50=0.18%
lat (msec) : 100=0.27%, 250=0.65%, 500=0.51%
cpu : usr=0.44%, sys=0.94%, ctx=123991, majf=0, minf=988
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=100.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=0,45859799,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=16
Run status group 0 (all jobs):
WRITE: bw=1491MiB/s (1564MB/s), 1491MiB/s-1491MiB/s (1564MB/s-1564MB/s), io=175GiB (188GB), run=120113-120113msec
Disk stats (read/write):
nvme0n1: ios=308/45807739, merge=0/0, ticks=57/2334550, in_queue=2334607, util=100.00%
4) nvme after this patch:
test: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
...
fio-3.13-42-g8066f
Starting 72 processes
Jobs: 72 (f=72): [W(72)][100.0%][w=1502MiB/s][w=384k IOPS][eta 00m:00s]
test: (groupid=0, jobs=72): err= 0: pid=5320: Mon Aug 2 10:42:07 2021
write: IOPS=383k, BW=1496MiB/s (1569MB/s)(175GiB/120098msec); 0 zone resets
slat (nsec): min=1229, max=370007k, avg=100549.47, stdev=4919208.81
clat (nsec): min=1634, max=370050k, avg=2892105.62, stdev=24891976.05
lat (usec): min=31, max=380005, avg=2995.16, stdev=25391.59
clat percentiles (usec):
| 1.00th=[ 38], 5.00th=[ 39], 10.00th=[ 39], 20.00th=[ 39],
| 30.00th=[ 39], 40.00th=[ 40], 50.00th=[ 40], 60.00th=[ 40],
| 70.00th=[ 41], 80.00th=[ 41], 90.00th=[ 42], 95.00th=[ 44],
| 99.00th=[135267], 99.50th=[252707], 99.90th=[287310], 99.95th=[291505],
| 99.99th=[304088]
bw ( MiB/s): min= 827, max= 2248, per=100.00%, avg=1496.99, stdev= 5.51, samples=17278
iops : min=211931, max=575591, avg=383228.21, stdev=1411.19, samples=17278
lat (usec) : 2=0.01%, 4=0.01%, 10=0.01%, 20=0.01%, 50=95.83%
lat (usec) : 100=0.15%, 250=0.05%, 500=0.13%, 750=0.18%, 1000=0.21%
lat (msec) : 2=0.85%, 4=0.84%, 10=0.14%, 20=0.05%, 50=0.14%
lat (msec) : 100=0.25%, 250=0.65%, 500=0.51%
cpu : usr=0.43%, sys=0.95%, ctx=123368, majf=0, minf=989
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=100.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=0,45995620,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=16
Run status group 0 (all jobs):
WRITE: bw=1496MiB/s (1569MB/s), 1496MiB/s-1496MiB/s (1569MB/s-1569MB/s), io=175GiB (188GB), run=120098-120098msec
Disk stats (read/write):
nvme0n1: ios=190/45976809, merge=0/0, ticks=34/2374865, in_queue=2374900, util=100.00%
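As a sanity check on the four runs above, the reported average IOPS can be recomputed in two ways from numbers in the fio output: issued writes divided by runtime, and Little's law (I/Os in flight divided by average total latency). The second estimate assumes all 72 * 16 = 1152 queue slots stay busy for the whole run, which is only an approximation. The helper names are made up for this sketch.

```python
NUMJOBS, IODEPTH = 72, 16
IN_FLIGHT = NUMJOBS * IODEPTH  # 1152, assumed fully used

# (label, issued writes, runtime in ms, avg "lat" in usec, reported IOPS)
RUNS = [
    ("null_blk before", 33525688, 120009, 4123.03, 279_000),
    ("null_blk after",  33697894, 120009, 4101.96, 281_000),
    ("nvme before",     45859799, 120113, 3004.41, 382_000),
    ("nvme after",      45995620, 120098, 2995.16, 383_000),
]


def iops_from_count(issued: int, runtime_ms: int) -> float:
    """Average IOPS from the number of issued I/Os and the runtime."""
    return issued / (runtime_ms / 1000.0)


def iops_from_latency(avg_lat_usec: float, in_flight: int = IN_FLIGHT) -> float:
    """Little's law: throughput = I/Os in flight / average latency."""
    return in_flight / (avg_lat_usec / 1_000_000.0)


if __name__ == "__main__":
    for label, issued, runtime_ms, lat_usec, reported in RUNS:
        by_count = iops_from_count(issued, runtime_ms)
        by_latency = iops_from_latency(lat_usec)
        print(f"{label}: reported={reported}, "
              f"count/runtime={by_count:.0f}, little={by_latency:.0f}")
```

Both estimates land within 1% of the IOPS fio reports for every run, so the summary figures are consistent with the detailed output.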
Thanks
Kuai