Re: KASAN: null-ptr-deref Write in blk_mq_map_swqueue

From: Dongli Zhang
Date: Fri Mar 27 2020 - 09:30:54 EST




On 3/26/20 7:52 PM, Bart Van Assche wrote:
> On 2020-03-26 17:19, Dongli Zhang wrote:
>> I think the issue is caused by line 2827: q->nr_hw_queues is updated too
>> early, while the init can still fail later.
>>
>> 2809 static void blk_mq_realloc_hw_ctxs(struct blk_mq_tag_set *set,
>> 2810                                    struct request_queue *q)
>> 2811 {
>> 2812         int i, j, end;
>> 2813         struct blk_mq_hw_ctx **hctxs = q->queue_hw_ctx;
>> 2814
>> 2815         if (q->nr_hw_queues < set->nr_hw_queues) {
>> 2816                 struct blk_mq_hw_ctx **new_hctxs;
>> 2817
>> 2818                 new_hctxs = kcalloc_node(set->nr_hw_queues,
>> 2819                                        sizeof(*new_hctxs), GFP_KERNEL,
>> 2820                                        set->numa_node);
>> 2821                 if (!new_hctxs)
>> 2822                         return;
>> 2823                 if (hctxs)
>> 2824                         memcpy(new_hctxs, hctxs, q->nr_hw_queues *
>> 2825                                sizeof(*hctxs));
>> 2826                 q->queue_hw_ctx = new_hctxs;
>> 2827                 q->nr_hw_queues = set->nr_hw_queues;
>> 2828                 kfree(hctxs);
>> 2829                 hctxs = new_hctxs;
>> 2830         }
>
> Which kernel tree does this syzbot report refer to? Commit
> d0930bb8f46b ("blk-mq: Fix a recently introduced regression in
> blk_mq_realloc_hw_ctxs()") in Jens' tree removed line 2827 shown above.
>

Thank you very much for sharing this. The commit below is in Jens' tree for 5.7.

commit d0930bb8f46b8fb4a7d429c0bf1c91b3ed00a7cf
Author: Bart Van Assche <bvanassche@xxxxxxx>
Date: Mon Mar 9 21:26:18 2020 -0700

    blk-mq: Fix a recently introduced regression in blk_mq_realloc_hw_ctxs()

    q->nr_hw_queues must only be updated once it is known that
    blk_mq_realloc_hw_ctxs() has succeeded. Otherwise it can happen that
    reallocation fails and that q->nr_hw_queues is larger than the number of
    allocated hardware queues. This patch fixes the following crash if
    increasing the number of hardware queues fails:

    BUG: KASAN: null-ptr-deref in blk_mq_map_swqueue+0x775/0x810
    Write of size 8 at addr 0000000000000118 by task check/977

    CPU: 3 PID: 977 Comm: check Not tainted 5.6.0-rc1-dbg+ #8
    Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
    Call Trace:
     dump_stack+0xa5/0xe6
     __kasan_report.cold+0x65/0x99
     kasan_report+0x16/0x20
     check_memory_region+0x140/0x1b0
     memset+0x28/0x40
     blk_mq_map_swqueue+0x775/0x810
     blk_mq_update_nr_hw_queues+0x468/0x710
     nullb_device_submit_queues_store+0xf7/0x1a0 [null_blk]
     configfs_write_file+0x1c4/0x250 [configfs]
     __vfs_write+0x4c/0x90
     vfs_write+0x145/0x2c0
     ksys_write+0xd7/0x180
     __x64_sys_write+0x47/0x50
     do_syscall_64+0x6f/0x2f0
     entry_SYSCALL_64_after_hwframe+0x49/0xbe

    Fixes: ac0d6b926e74 ("block: Reduce the amount of memory required per request queue")
    Signed-off-by: Bart Van Assche <bvanassche@xxxxxxx>
    Reviewed-by: Ming Lei <ming.lei@xxxxxxxxxx>
    Cc: Keith Busch <kbusch@xxxxxxxxxx>
    Cc: Johannes Thumshirn <jth@xxxxxxxxxx>
    Cc: Hannes Reinecke <hare@xxxxxxxx>
    Cc: Christoph Hellwig <hch@xxxxxxxxxxxxx>
    Signed-off-by: Jens Axboe <axboe@xxxxxxxxx>

diff --git a/block/blk-mq.c b/block/blk-mq.c
index d4bd9b961726..37ff8dfb8ab9 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2824,7 +2824,6 @@ static void blk_mq_realloc_hw_ctxs(struct blk_mq_tag_set *set,
 			memcpy(new_hctxs, hctxs, q->nr_hw_queues *
 			       sizeof(*hctxs));
 		q->queue_hw_ctx = new_hctxs;
-		q->nr_hw_queues = set->nr_hw_queues;
 		kfree(hctxs);
 		hctxs = new_hctxs;
 	}


That should be the reason why "init_hctx() fault injection" was introduced.
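
To make the ordering issue concrete, here is a minimal userspace sketch of
the pattern the commit message describes: the pointer array is grown first,
but the count is only published once every new entry has been initialised.
This is not the kernel code. struct queue, struct hctx, init_one(),
grow_hw_ctxs(), all_mapped() and the fail_at knob are all hypothetical names
made up for illustration, and fail_at only loosely imitates what a fault
injection hook would do.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace stand-ins; these are not the kernel structures. */
struct hctx { unsigned int idx; };

struct queue {
	struct hctx **hw_ctx;	/* models q->queue_hw_ctx */
	unsigned int nr_hw;	/* models q->nr_hw_queues */
};

/* Crude fault-injection knob: index whose init fails, or -1 for none. */
static int fail_at = -1;

static struct hctx *init_one(unsigned int idx)
{
	struct hctx *h;

	if (fail_at >= 0 && idx == (unsigned int)fail_at)
		return NULL;	/* simulated init failure */
	h = malloc(sizeof(*h));
	if (h)
		h->idx = idx;
	return h;
}

/* Grow q to 'want' entries. The count is published only on full success. */
static void grow_hw_ctxs(struct queue *q, unsigned int want)
{
	struct hctx **n;
	unsigned int i;

	if (q->nr_hw >= want)
		return;		/* shrinking is out of scope for this sketch */

	n = calloc(want, sizeof(*n));
	if (!n)
		return;
	if (q->hw_ctx)
		memcpy(n, q->hw_ctx, q->nr_hw * sizeof(*n));
	free(q->hw_ctx);
	q->hw_ctx = n;
	/* q->nr_hw is deliberately NOT updated here. */

	for (i = q->nr_hw; i < want; i++) {
		q->hw_ctx[i] = init_one(i);
		if (!q->hw_ctx[i])
			break;
	}
	if (i != want) {
		/* Roll back the entries added by this call; keep the old count. */
		while (i > q->nr_hw) {
			i--;
			free(q->hw_ctx[i]);
			q->hw_ctx[i] = NULL;
		}
		return;
	}
	q->nr_hw = want;
}

/* Invariant: every index below nr_hw has a non-NULL entry. */
static int all_mapped(const struct queue *q)
{
	unsigned int i;

	for (i = 0; i < q->nr_hw; i++)
		if (!q->hw_ctx[i])
			return 0;
	return 1;
}

/* Returns 1 if the invariant survives a failed grow from 2 to 4 entries. */
static int demo_failed_grow(void)
{
	struct queue q = { NULL, 0 };

	fail_at = -1;
	grow_hw_ctxs(&q, 2);
	assert(q.nr_hw == 2);

	fail_at = 3;		/* the fourth entry fails to initialise */
	grow_hw_ctxs(&q, 4);
	return q.nr_hw == 2 && all_mapped(&q);
}
```

If the count were instead assigned right after the array swap, as on line
2827 in the listing quoted earlier, a failed grow would leave nr_hw at 4 with
NULL slots behind it, and any later loop over [0, nr_hw) would dereference
them. Calling demo_failed_grow() from a small main() is enough to exercise
the failure path in this sketch.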

Thank you very much!

Dongli Zhang