Re: NULL deref around blkmq in v4.0-rc1–rc7
From: Linus Torvalds
Date: Thu Apr 09 2015 - 17:12:30 EST
On Thu, Apr 9, 2015 at 11:24 AM, Jan Engelhardt <jengelh@xxxxxxx> wrote:
>
> It's fairly consistent (reproducible?). Only 1 in 15 or so (have not kept track
> really) attempts does it not die.
>
> With frame pointers:
> [<ffffffff81286d59>] scsi_queue_rq+0x2e8/0x3d2
> [<ffffffff8119e64d>] __blk_mq_run_hw_queue+0x19b/0x2a2
> [<ffffffff8119e901>] ? blk_mq_merge_queue_io+0x75/0x147
> [<ffffffffa00fa34a>] ? __xfs_get_blocks+0x2f9/0x2f9 [xfs]
> [<ffffffff8119edeb>] blk_mq_run_hw_queue+0x4f/0x99
> [<ffffffff8119fab9>] blk_sq_make_request+0x163/0x170
Ok, good.
So the cmd comes from
struct scsi_cmnd *cmd = blk_mq_rq_to_pdu(req);
which in turn is just
return (void *) rq + sizeof(*rq);
which in turn is written by some crazy monkey on crack. That's some
shit code. Why the hell you'd write it that way, when the natural
thing to do would be just
return rq + 1;
without the sizeof, and without the cast.
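To make the comparison concrete, here is a minimal user-space sketch of the two forms. The struct fake_rq type and the function names are hypothetical stand-ins, not kernel code; the only claim is that stepping a typed pointer by one element lands on the same address as adding sizeof(*rq) bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct request; the fields are illustrative. */
struct fake_rq {
    unsigned long flags;
    int tag;
};

/*
 * Byte-offset form, as quoted in the mail. Arithmetic on void * is a
 * GNU C extension (the pointee is treated as one byte), so this sketch
 * uses char * to stay within standard C.
 */
static void *pdu_via_sizeof(struct fake_rq *rq)
{
    return (char *)rq + sizeof(*rq);
}

/* Typed form: rq + 1 advances by exactly one struct fake_rq. */
static void *pdu_via_plus_one(struct fake_rq *rq)
{
    return rq + 1;
}

/* Returns 1 when both forms point just past the first element. */
static int pdu_forms_agree(void)
{
    struct fake_rq slab[2];
    void *a = pdu_via_sizeof(&slab[0]);
    void *b = pdu_via_plus_one(&slab[0]);

    return a == b && b == (void *)&slab[1];
}
```

Calling pdu_forms_agree() should return 1 on any conforming compiler, since both expressions compute the address of the element that follows rq.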
The particular crazy monkey on crack is Jens Axboe, in commit 320ae51feed5c.
Jens, really. This code is shit.
That ->sense_buffer thing is supposed to be initialized by the
blk_mq_ops.init_request() function, which is called - if it exists -
when the array of requests ('->rqs[]') is initialized.
And that code too looks like crap. It seems to be very clever, trying
to allocate big contiguous chunks of RAM for the requests, but then
the initialization sequence is questionable as hell. It takes that
nonzeroed allocation, and zeroes a few fields randomly. The rest will
contain whatever garbage data they used to.
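The difference between zeroing a few fields and zeroing the whole allocation can be sketched in user space as follows. Everything here is an assumption made for illustration: struct fake_rq, its driver_state field, and the 0xAA poison byte that stands in for whatever a non-zeroed page happened to contain. Only the names atomic_flags and cmd_flags come from the code under discussion.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical request layout; driver_state stands for "everything else". */
struct fake_rq {
    unsigned long atomic_flags;
    unsigned long cmd_flags;
    unsigned long driver_state;
};

/* Simulate memory that comes back from the allocator without zeroing. */
static void simulate_stale_memory(struct fake_rq *rq)
{
    memset(rq, 0xAA, sizeof(*rq));
}

/* Partial init: clear two fields, leave the rest as it was. */
static void init_partial(struct fake_rq *rq)
{
    simulate_stale_memory(rq);
    rq->atomic_flags = 0;
    rq->cmd_flags = 0;
}

/* Full init: the user-space analogue of asking for zeroed pages. */
static void init_zeroed(struct fake_rq *rq)
{
    simulate_stale_memory(rq);
    memset(rq, 0, sizeof(*rq));
}
```

After init_partial() the two flag fields are zero but driver_state still holds the poison pattern; after init_zeroed() every field is zero, which is the property the __GFP_ZERO change is meant to give the real allocation.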
Does this entirely untested patch make any difference?
And Jens, this all really looks very fishy. When I look at these kinds
of core functions, and find just *stupid* code like this, it makes me
unhappy.
Anyway, I assume that the actual "disk" in question is mpt fusion?
Linus
block/blk-mq.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/block/blk-mq.c b/block/blk-mq.c
index b7b8933ec241..33c428530193 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1457,7 +1457,7 @@ static struct blk_mq_tags *blk_mq_init_rq_map(struct blk_mq_tag_set *set,
do {
page = alloc_pages_node(set->numa_node,
- GFP_KERNEL | __GFP_NOWARN | __GFP_NORETRY,
+ GFP_KERNEL | __GFP_NOWARN | __GFP_NORETRY | __GFP_ZERO,
this_order);
if (page)
break;
@@ -1479,8 +1479,6 @@ static struct blk_mq_tags *blk_mq_init_rq_map(struct blk_mq_tag_set *set,
left -= to_do * rq_size;
for (j = 0; j < to_do; j++) {
tags->rqs[i] = p;
- tags->rqs[i]->atomic_flags = 0;
- tags->rqs[i]->cmd_flags = 0;
if (set->ops->init_request) {
if (set->ops->init_request(set->driver_data,
tags->rqs[i], hctx_idx, i,