Re: [PATCH][V3] nbd: add multi-connection support

From: Sagi Grimberg
Date: Tue Oct 11 2016 - 05:00:29 EST



> NBD can become contended on its single connection. We have to serialize all
> writes and we can only process one read response at a time. Fix this by
> allowing userspace to provide multiple connections to a single nbd device. This
> coupled with block-mq drastically increases performance in multi-process cases.
> Thanks,

Hey Josef,

I gave this patch a try and I'm hitting an "unable to handle kernel paging
request" oops when running a multi-threaded write workload [1].

I have 2 VMs on my laptop, each assigned 2 CPUs. I connected
the client to the server via 2 connections and ran:
fio --group_reporting --rw=randwrite --bs=4k --numjobs=2 --iodepth=128 --runtime=60 --time_based --loops=1 --ioengine=libaio --direct=1 --invalidate=1 --randrepeat=1 --norandommap --exitall --name task_nbd0 --filename=/dev/nbd0

The server backend is null_blk btw:
./nbd-server 1022 /dev/nullb0

nbd-client:
./nbd-client -C 2 192.168.100.3 1022 /dev/nbd0

[1]:
[ 171.813649] BUG: unable to handle kernel paging request at 0000000235363130
[ 171.816015] IP: [<ffffffffc0645e39>] nbd_queue_rq+0x319/0x580 [nbd]
[ 171.816015] PGD 7a080067 PUD 0
[ 171.816015] Oops: 0000 [#1] SMP
[ 171.816015] Modules linked in: nbd(O) rpcsec_gss_krb5 nfsv4 ib_iser iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi snd_hda_codec_generic ppdev kvm_intel cirrus snd_hda_intel ttm kvm irqbypass drm_kms_helper snd_hda_codec drm snd_hda_core snd_hwdep joydev input_leds fb_sys_fops snd_pcm serio_raw syscopyarea snd_timer sysfillrect snd sysimgblt soundcore i2c_piix4 nfsd ib_umad parport_pc auth_rpcgss nfs_acl rdma_ucm nfs rdma_cm iw_cm lockd grace ib_cm configfs sunrpc ib_uverbs mac_hid fscache ib_core lp parport psmouse floppy e1000 pata_acpi [last unloaded: nbd]
[ 171.816015] CPU: 0 PID: 196 Comm: kworker/0:1H Tainted: G O 4.8.0-rc4+ #61
[ 171.816015] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
[ 171.816015] Workqueue: kblockd blk_mq_run_work_fn
[ 171.816015] task: ffff8f0b37b23280 task.stack: ffff8f0b37bf0000
[ 171.816015] RIP: 0010:[<ffffffffc0645e39>] [<ffffffffc0645e39>] nbd_queue_rq+0x319/0x580 [nbd]
[ 171.816015] RSP: 0018:ffff8f0b37bf3c20 EFLAGS: 00010206
[ 171.816015] RAX: 0000000235363130 RBX: 0000000000000000 RCX: 0000000000000200
[ 171.816015] RDX: 0000000000000200 RSI: ffff8f0b37b23b48 RDI: ffff8f0b37b23280
[ 171.816015] RBP: ffff8f0b37bf3cc8 R08: 0000000000000001 R09: 0000000000000000
[ 171.816015] R10: 0000000000000000 R11: ffff8f0b37f21000 R12: 0000000023536303
[ 171.816015] R13: 0000000000000000 R14: 0000000023536313 R15: ffff8f0b37f21000
[ 171.816015] FS: 0000000000000000(0000) GS:ffff8f0b3d200000(0000) knlGS:0000000000000000
[ 171.816015] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 171.816015] CR2: 0000000235363130 CR3: 00000000789b7000 CR4: 00000000000006f0
[ 171.816015] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 171.816015] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 171.816015] Stack:
[ 171.816015] ffff8f0b00000000 ffff8f0b37a79480 ffff8f0b378513c8 0000000000000282
[ 171.816015] ffff8f0b37b28428 ffff8f0b37a795f0 ffff8f0b37f21500 00000a0023536313
[ 171.816015] ffffea0001c69080 0000000000000000 ffff8f0b37b28280 1395602537b23280
[ 171.816015] Call Trace:
[ 171.816015] [<ffffffffb8426840>] __blk_mq_run_hw_queue+0x260/0x390
[ 171.816015] [<ffffffffb84269b2>] blk_mq_run_work_fn+0x12/0x20
[ 171.816015] [<ffffffffb80aae21>] process_one_work+0x1f1/0x6b0
[ 171.816015] [<ffffffffb80aada2>] ? process_one_work+0x172/0x6b0
[ 171.816015] [<ffffffffb80ab32e>] worker_thread+0x4e/0x490
[ 171.816015] [<ffffffffb80ab2e0>] ? process_one_work+0x6b0/0x6b0
[ 171.816015] [<ffffffffb80ab2e0>] ? process_one_work+0x6b0/0x6b0
[ 171.816015] [<ffffffffb80b1f41>] kthread+0x101/0x120
[ 171.816015] [<ffffffffb88d4ecf>] ret_from_fork+0x1f/0x40
[ 171.816015] [<ffffffffb80b1e40>] ? kthread_create_on_node+0x250/0x250
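
For anyone reading the trace in [1]: the `nbd_queue_rq+0x319/0x580 [nbd]` notation means the faulting instruction sits at offset 0x319 into a function that is 0x580 bytes long, in the nbd module. Below is a minimal sketch, assuming Python 3, of pulling those fields out of a log line so the offset can be looked up against the module's object file. The helper name `parse_oops_symbol` is hypothetical and is not part of any kernel tooling.

```python
import re

# Matches "symbol+0xOFFSET/0xSIZE", optionally followed by " [module]".
SYM_RE = re.compile(
    r"(?P<sym>[A-Za-z_][\w.]*)\+0x(?P<off>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:\s+\[(?P<mod>\w+)\])?"
)


def parse_oops_symbol(line):
    """Return symbol, offset, size and module from an oops line, or None."""
    m = SYM_RE.search(line)
    if m is None:
        return None
    return {
        "symbol": m.group("sym"),
        "offset": int(m.group("off"), 16),   # bytes into the function
        "size": int(m.group("size"), 16),    # total function length in bytes
        "module": m.group("mod"),            # None for built-in symbols
    }


if __name__ == "__main__":
    line = "[ 171.816015] IP: [<ffffffffc0645e39>] nbd_queue_rq+0x319/0x580 [nbd]"
    print(parse_oops_symbol(line))
```

The offset and size are kept as integers so they can be compared or formatted back to hex when matching against a disassembly of the module.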