Re: NULL deref around xfs in v4.0-rc1–rc7

From: Linus Torvalds
Date: Thu Apr 09 2015 - 13:38:39 EST


On Wed, Apr 8, 2015 at 8:20 AM, Jan Engelhardt <jengelh@xxxxxxx> wrote:
> On Wednesday 2015-04-08 15:41, Jan Engelhardt wrote:
>
>>Starting somewhere around v4.0-rc1 and persisting through commit
>>v4.0-rc7, there is a new NULL deference apparently happening in
>>conjunction with xfs. This inhibits this machine's booting,
>>as xfs is used for the root filesystem.
>>
>>First bisection points at first-bad commit v4.0-rc1~8, and since that is
>>a merge commit, I'll be investigating some more hand-chosen commits (and
>>then people to Cc) as we speak.
>
> I reran bisect just to be sure.
> It now shows v4.0-rc1~9 is bad, v4.0-rc1~9^1 is ok, and v4.0-rc~9^2 is
> ok too. So this means that the combination of the both ~9 childs work
> badly together.

Ok, that's just _odd_.

That v4.0-rc1~9 is just the pm+acpi merge, and has absolutely nothing
to do with XFS or the block code. In fact, looking at the diff from
its direct parent, it doesn't even really change any relevant code.

So I get the feeling that the oops you are seeing is likely not
consistent, and may depend on allocation patterns or similar. Because
the bisect doesn't make any sense at all.

It looks much more like a pure block-mq bug, but one that needs some
very special condition to trigger.

Jens, does this look familiar or trigger any ideas:

BUG: unable to handle kernel paging request at 0000000000001000
IP: [<ffffffff812718d0>] scsi_init_cmd_errh+0x26/0x5d

(The whole oops is on lkml).

Jan, can you reproduce the oops with frame pointers so that we get a
better call trace? Although it looks fairly normal: the trapping code
is

rep stos %eax,%es:*(%rdi)

and %rdi is 0x1000. It seems to be simply

memset(cmd->sense_buffer, 0, SCSI_SENSE_BUFFERSIZE);

where 'cmd->sense_buffer' has some insane value ("PAGE_SIZE" or just a
flipped bit, or whatever)
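
To make that reasoning concrete, here is a minimal userspace sketch, not the actual SCSI code. The names 'fake_cmd', 'FAKE_SENSE_BUFFERSIZE', 'clear_sense' and 'first_store_address' are made up for illustration, and the size of 96 is an assumption. The point is only that the memset destination is what ends up in %rdi for the 'rep stos', so a bogus pointer value shows up directly as the faulting address:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for SCSI_SENSE_BUFFERSIZE (assumed to be 96). */
#define FAKE_SENSE_BUFFERSIZE 96

/* Hypothetical, stripped-down stand-in for the real command structure. */
struct fake_cmd {
	unsigned char *sense_buffer;
};

/*
 * Same shape as the statement in the oops. On x86-64 the compiler may
 * emit this as 'rep stos' with the destination pointer in %rdi.
 */
void clear_sense(struct fake_cmd *cmd)
{
	assert(cmd != NULL);
	memset(cmd->sense_buffer, 0, FAKE_SENSE_BUFFERSIZE);
}

/*
 * The first byte the memset would touch. If the pointer is garbage and
 * that page is unmapped, this is the address the fault gets reported at.
 */
uintptr_t first_store_address(const struct fake_cmd *cmd)
{
	return (uintptr_t)cmd->sense_buffer;
}
```

With a valid buffer, clear_sense() just zeroes it. With 'sense_buffer' set to 0x1000, first_store_address() returns 0x1000, which matches the reported paging request address.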

Jens?

Linus