Getting low IOPS on kernel 3.19.8 with multi-queue enabled

From: Nisha Miller
Date: Thu May 05 2016 - 18:06:22 EST


Hi,

I've bought a Xeon-based server and I'm trying to see how many IOPS I
can get out of it using the null_blk driver. Eventually I'll install an
NVMe SSD card in it.

According to various online documents (see one URL below), I should be
getting upwards of 4M IOPS, but I'm getting only 2.2M. What am I doing
wrong here?

http://bjorling.me/blkmq-slides.pdf

I'm using CentOS 7.2 with kernel 3.19.8. The following flags are enabled:

CONFIG_POSIX_MQUEUE=y
CONFIG_POSIX_MQUEUE_SYSCTL=y
CONFIG_SCSI_MQ_DEFAULT=y

This is how I load the null_blk driver:

insmod null_blk.ko submit_queues=48 queue_mode=2 gb=100 bs=4096 \
    nr_devices=1 irqmode=0 completion_nsec=1 hw_queue_depth=128
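
To confirm which values the driver actually ended up with after loading, the module parameters can be read back from sysfs. The following is only a minimal sketch: show_null_blk_params is a hypothetical helper name, and it assumes the parameters are exposed under /sys/module/null_blk/parameters, the usual location for readable module parameters. As far as I know the driver may lower submit_queues when it exceeds the number of CPUs, so the value read back could differ from the 48 requested; that is an assumption worth verifying on the running kernel.

```shell
# Print each null_blk module parameter as name=value.
# show_null_blk_params is a hypothetical helper; the optional directory
# argument exists only so the function can be pointed at another path.
show_null_blk_params() {
    dir="${1:-/sys/module/null_blk/parameters}"
    if [ ! -d "$dir" ]; then
        # Module not loaded, or parameters not exported on this kernel.
        echo "null_blk parameters not found at $dir (module not loaded?)"
        return 0
    fi
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        printf '%s=%s\n' "$(basename "$f")" "$(cat "$f")"
    done
}

show_null_blk_params
```

Comparing the printed submit_queues and hw_queue_depth against the insmod line shows whether the requested settings were applied as given.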

I'm using fio for the benchmark and here is how I run it:

./fio --filename=/dev/nullb0 --direct=1 --ioengine=libaio \
    --rw=randread --bs=4k --size=200m --numjobs=200 --iodepth=64 \
    --runtime=30 --time_based --group_reporting --name=journal-test
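
To see how the result changes with the number of jobs, the same command can be generated for several values of --numjobs. This is a rough sketch, not a tuned benchmark: fio_cmd is a hypothetical helper name, the job counts other than 200 are arbitrary illustration values, and every fio option is copied unchanged from the command above. The script only prints the commands; it does not run them.

```shell
# Print (not run) the fio command from this post for a given job count
# and queue depth. fio_cmd is a hypothetical helper name.
fio_cmd() {
    numjobs="$1"
    iodepth="$2"
    printf '%s\n' "./fio --filename=/dev/nullb0 --direct=1 --ioengine=libaio --rw=randread --bs=4k --size=200m --numjobs=$numjobs --iodepth=$iodepth --runtime=30 --time_based --group_reporting --name=journal-test"
}

# Dry run: list the commands for a few job counts at the original depth
# of 64. Only 200 comes from the post; 1, 12 and 24 are illustrative.
for n in 1 12 24 200; do
    fio_cmd "$n" 64
done
```

Piping the output to sh would execute the runs one after another, each for the 30 seconds set by --runtime.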

The server is Xeon-based; here is the output of lscpu:

Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 24
On-line CPU(s) list: 0-23
Thread(s) per core: 2
Core(s) per socket: 6
Socket(s): 2
NUMA node(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model: 63
Model name: Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz
Stepping: 2
CPU MHz: 1299.281
BogoMIPS: 4805.22
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 15360K
NUMA node0 CPU(s): 0-5,12-17
NUMA node1 CPU(s): 6-11,18-23
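
For reference, the topology above can be set against the settings used earlier with some simple arithmetic. This is only a sketch: all numbers are copied from the lscpu output and from the insmod and fio commands in this post, and the variable names are made up for illustration.

```shell
# Values copied from the lscpu output and the commands in this post.
sockets=2
cores_per_socket=6
threads_per_core=2
submit_queues=48
numjobs=200
iodepth=64

# Logical CPUs = sockets x cores per socket x threads per core.
logical_cpus=$((sockets * cores_per_socket * threads_per_core))

# Upper bound on I/Os that fio may keep in flight across all jobs.
total_inflight=$((numjobs * iodepth))

echo "logical CPUs:          $logical_cpus"
echo "submit_queues per CPU: $((submit_queues / logical_cpus))"
echo "fio jobs per CPU:      $((numjobs / logical_cpus)) (integer division)"
echo "max in-flight I/Os:    $total_inflight"
```

The product matches the 24 CPUs that lscpu reports, so the module is asked for twice as many submit queues as there are CPUs, and fio runs roughly eight jobs per CPU.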

Is there anything I can do to improve the performance?

TIA
Nisha