Re: [PATCH 1/1] blk-mq: get ctx in order to handle BLK_MQ_S_INACTIVE in blk_mq_get_tag()

From: Daniel Wagner
Date: Tue Jun 02 2020 - 03:51:31 EST


On Mon, Jun 01, 2020 at 11:17:49PM -0700, Dongli Zhang wrote:
> When scheduler is set, we hit below page fault when we offline cpu.
>
> [ 1061.007725] BUG: kernel NULL pointer dereference, address: 0000000000000040
> [ 1061.008710] #PF: supervisor read access in kernel mode
> [ 1061.009492] #PF: error_code(0x0000) - not-present page
> [ 1061.010241] PGD 0 P4D 0
> [ 1061.010614] Oops: 0000 [#1] SMP PTI
> [ 1061.011130] CPU: 0 PID: 122 Comm: kworker/0:1H Not tainted 5.7.0-rc7+ #2'
> ... ...
> [ 1061.013760] Workqueue: kblockd blk_mq_run_work_fn
> [ 1061.014446] RIP: 0010:blk_mq_put_tag+0xf/0x30
> ... ...
> [ 1061.017726] RSP: 0018:ffffa5c18037fc70 EFLAGS: 00010287
> [ 1061.018475] RAX: 0000000000000000 RBX: ffffa5c18037fcf0 RCX: 0000000000000004
> [ 1061.019507] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff911535dc1180
> ... ...
> [ 1061.028454] Call Trace:
> [ 1061.029307] blk_mq_get_tag+0x26e/0x280
> [ 1061.029866] ? wait_woken+0x80/0x80
> [ 1061.030378] blk_mq_get_driver_tag+0x99/0x110
> [ 1061.031009] blk_mq_dispatch_rq_list+0x107/0x5e0
> [ 1061.031672] ? elv_rb_del+0x1a/0x30
> [ 1061.032178] blk_mq_do_dispatch_sched+0xe2/0x130
> [ 1061.032844] __blk_mq_sched_dispatch_requests+0xcc/0x150
> [ 1061.033638] blk_mq_sched_dispatch_requests+0x2b/0x50
> [ 1061.034239] __blk_mq_run_hw_queue+0x75/0x110
> [ 1061.034867] process_one_work+0x15c/0x370
> [ 1061.035450] worker_thread+0x44/0x3d0
> [ 1061.035980] kthread+0xf3/0x130
> [ 1061.036440] ? max_active_store+0x80/0x80
> [ 1061.037018] ? kthread_bind+0x10/0x10
> [ 1061.037554] ret_from_fork+0x35/0x40
> [ 1061.038073] Modules linked in:
> [ 1061.038543] CR2: 0000000000000040
> [ 1061.038962] ---[ end trace d20e1df7d028e69f ]---
>
> This is because blk_mq_get_driver_tag() would be used to allocate tag once
> scheduler (e.g., mq-deadline) is set. However, in order to handle
> BLK_MQ_S_INACTIVE in blk_mq_get_tag(), we need to set data->ctx for
> blk_mq_put_tag().
>
> Fixes: bf0beec0607db3c6 ("blk-mq: drain I/O when all CPUs in a hctx are offline")
> Signed-off-by: Dongli Zhang <dongli.zhang@xxxxxxxxxx>

Reviewed-by: Daniel Wagner <dwagner@xxxxxxx>