Re: [PATCH V2 00/16] Introduce the BFQ I/O scheduler

From: Paolo Valente
Date: Tue Apr 11 2017 - 04:43:27 EST



> On 10 Apr 2017, at 18:56, Bart Van Assche <bart.vanassche@xxxxxxxxxxx> wrote:
>
> On Fri, 2017-03-31 at 14:47 +0200, Paolo Valente wrote:
>> [ ... ]
>
> Hello Paolo,
>
> Is the git tree that is available at https://github.com/Algodev-github/bfq-mq
> appropriate for testing BFQ? If I merge that tree with v4.11-rc6 and if I run
> the srp-test software against that tree as follows:
>
> ./run_tests -e bfq-mq -t 02-mq
>
> then the following appears on the console:
>
> [ 2748.650352] BUG: unable to handle kernel NULL pointer dereference at 00000000000000d0
> [ 2748.650442] IP: __bfq_insert_request+0x26/0x650 [bfq_mq_iosched]
> [ 2748.650509] PGD 0
> [ 2748.650511]
> [ 2748.650585] Oops: 0000 [#1] SMP
> [ 2748.651107] CPU: 9 PID: 10772 Comm: kworker/9:2H Tainted: G I 4.11.0-rc6-dbg+ #1
> [ 2748.651191] Workqueue: kblockd blk_mq_requeue_work
> [ 2748.651228] task: ffff88037c808040 task.stack: ffffc90003b4c000
> [ 2748.651268] RIP: 0010:__bfq_insert_request+0x26/0x650 [bfq_mq_iosched]
> [ 2748.651307] RSP: 0018:ffffc90003b4f9d8 EFLAGS: 00010002
> [ 2748.651345] RAX: 0000000000000001 RBX: 0000000000000000 RCX: 0000000000000001
> [ 2748.651383] RDX: 0000000000000001 RSI: ffff880377f52e80 RDI: ffff880401f774e8
> [ 2748.651423] RBP: ffffc90003b4fa80 R08: 9093955f00000000 R09: 0000000000000001
> [ 2748.651464] R10: ffffc90003b4fa00 R11: ffffffffa06d0d53 R12: ffff880401f77840
> [ 2748.651506] R13: ffff880401f774e8 R14: ffff880378a451e0 R15: 0000000000000000
> [ 2748.651547] FS: 0000000000000000(0000) GS:ffff88046f040000(0000) knlGS:0000000000000000
> [ 2748.651588] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 2748.651626] CR2: 00000000000000d0 CR3: 0000000001c0f000 CR4: 00000000001406e0
> [ 2748.651664] Call Trace:
> [ 2748.651778] bfq_insert_request+0x83/0x280 [bfq_mq_iosched]
> [ 2748.651934] bfq_insert_requests+0x50/0x70 [bfq_mq_iosched]
> [ 2748.651975] blk_mq_sched_insert_request+0x11e/0x170
> [ 2748.652015] blk_insert_cloned_request+0xb6/0x1f0
> [ 2748.652361] map_request+0x13c/0x290 [dm_mod]
> [ 2748.652403] dm_mq_queue_rq+0x90/0x160 [dm_mod]
> [ 2748.652441] blk_mq_dispatch_rq_list+0x1f2/0x3e0
> [ 2748.652479] blk_mq_sched_dispatch_requests+0xf1/0x190
> [ 2748.652516] __blk_mq_run_hw_queue+0x12d/0x1c0
> [ 2748.652553] __blk_mq_delay_run_hw_queue+0xe3/0xf0
> [ 2748.652593] blk_mq_run_hw_queues+0x5c/0x80
> [ 2748.652632] blk_mq_requeue_work+0x132/0x150
> [ 2748.652671] process_one_work+0x206/0x6a0
> [ 2748.652709] worker_thread+0x49/0x4a0
> [ 2748.652745] kthread+0x107/0x140
> [ 2748.652854] ret_from_fork+0x2e/0x40
> [ 2748.652891] Code: ff 0f 1f 40 00 55 48 89 e5 41 57 41 56 41 55 41 54 53 48 83 c4 80 8b 87 58 03 00 00 48 8b 9e b0 00 00 00 85 c0 0f 84 8b 04 00 00 <48> 8b 83 d0 00 00 00 48 85 c0 0f 84 63 04 00 00 48 83 e8 10 48
> [ 2748.653049] RIP: __bfq_insert_request+0x26/0x650 [bfq_mq_iosched] RSP: ffffc90003b4f9d8
> [ 2748.653090] CR2: 00000000000000d0
>
> The crash address corresponds to the following source code according to gdb:
>
> (gdb) list *(__bfq_insert_request+0x26)
> 0xd6f6 is in __bfq_insert_request (block/bfq-mq-iosched.c:4430).
> 4425
> 4426 static void __bfq_insert_request(struct bfq_data *bfqd, struct request *rq)
> 4427 {
> 4428 struct bfq_queue *bfqq = RQ_BFQQ(rq), *new_bfqq;
> 4429
> 4430 assert_spin_locked(&bfqd->lock);
> 4431
> 4432 bfq_log_bfqq(bfqd, bfqq, "__insert_req: rq %p bfqq %p", rq, bfqq);
> 4433
> 4434 /*
>

Hi Bart,
I've tried to figure out how to deal with this crash, but I haven't
found a sensible way forward, for the following two reasons.

First, unless I'm missing something, I don't yet have the hardware
required to run srp-test, so I cannot easily reproduce this failure.
Actually, BFQ is not yet suitable, and in its current design may never
be, for very high-speed hardware such as InfiniBand and NVMe devices.

Second, a NULL-pointer fault at the line you report is rather weird.
In fact, the sequence of C statements executed up to that line is:

struct bfq_data *bfqd = q->elevator->elevator_data;
...
spin_lock_irq(&bfqd->lock);
__bfq_insert_request(bfqd, rq);
/* inside the __bfq_insert_request function: */
struct bfq_queue *bfqq = RQ_BFQQ(rq), ...;
assert_spin_locked(&bfqd->lock);

So, how can the last line cause a NULL-pointer-dereference exception
on the same address, &bfqd->lock, on which spin_lock_irq(&bfqd->lock)
had just succeeded in taking the spin lock?

Any idea on how to proceed? If this strange bug remains hard to spot,
then, if you agree, I will in the meantime go ahead and submit a new
version of the patch series, which addresses the other issues you
raised.

Thanks,
Paolo

> Bart.