Re: [BUG]NULL pointer dereference at 0000000000000008 __blkdev_put+0x17f/0x1d0
From: Jack Wang
Date: Mon Jan 06 2014 - 03:48:00 EST
On 01/04/2014 07:09 AM, Al Viro wrote:
> On Thu, Jan 02, 2014 at 10:36:30AM +0100, Jack Wang wrote:
>>> The bug happened at line 1486. It looks like disk->fops is NULL here for
>>> some reason. Is it reasonable to add a check like:
>>>     if (disk->fops)
>>>         if (disk->fops->release)
>>>             ret = disk->fops->release(disk, mode);
>>> Happy New Year and Best regards:)
>> Ping, could you share your opinion on this? The patch I proposed is attached.
> Sorry, had been sick since mid-December ;-/ The patch is not a good idea -
> in the best case it's papering over a bug (and insufficiently so, at that,
> since there are other places where disk->fops->some_method is checked).
> gendisk->fops should never be assigned NULL; it starts life with NULL
> ->fops, but that should be assigned a non-NULL value (and never modified
> afterwards) before anyone can see it. Moreover, even if some driver has
> fscked up and forgot to initialize the damn thing, get_gendisk() would've
> refused to return such a thing to any callers (including __blkdev_get()).
> Note that __blkdev_get() would oops on such a thing if get_gendisk()
> somehow returned it.
> Looks like something is shitting over bdev->bd_disk or bdev->bd_disk->fops.
> The offsets in the disassembled code are all wrong (including that from
> beginning of function to oopsing instruction), but the code match is good,
> so I agree that we are hitting bdev->bd_disk->fops == NULL here. The
> question is how it has happened - that's where the real bug is...
> How reproducible is it? And which kernel, while we are at it? This area
> didn't get a lot of changes lately, but still...
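
The invariant described above can be sketched in user space. This is only a model, not kernel code: the structs are cut down to the fields under discussion, model_get_disk(), model_release(), demo_release() and demo_fops are hypothetical names, and the placement of ->release as the second member of the operations table is an assumption about the layout. Under that assumption, a NULL fops on a 64-bit machine would make the load of ->release fault at address 0x8, which is consistent with the address in the subject line.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified user-space stand-ins; NOT the real kernel definitions. */
struct gendisk;

struct block_device_operations {
    int (*open)(struct gendisk *disk, unsigned int mode);
    /* Assumed to be the second member, i.e. one pointer past the start. */
    int (*release)(struct gendisk *disk, unsigned int mode);
};

struct gendisk {
    const struct block_device_operations *fops;
};

/* Hypothetical lookup modelled on the behaviour described for
 * get_gendisk(): a disk whose fops was never set is not handed out. */
static struct gendisk *model_get_disk(struct gendisk *candidate)
{
    if (candidate == NULL || candidate->fops == NULL)
        return NULL;
    return candidate;
}

/* Model of the release path: fops itself is trusted to be non-NULL,
 * and only the optional method is checked before the call. */
static int model_release(struct gendisk *disk, unsigned int mode)
{
    /* If this fires, something corrupted the disk after lookup;
     * that corruption is the real bug, not the missing check. */
    assert(disk->fops != NULL);
    if (disk->fops->release)
        return disk->fops->release(disk, mode);
    return 0;
}

/* Hypothetical driver method used only to exercise the model. */
static int demo_release(struct gendisk *disk, unsigned int mode)
{
    (void)disk;
    return (int)mode;
}

static const struct block_device_operations demo_fops = {
    .open = NULL,
    .release = demo_release,
};
```

In this model a disk with NULL fops never gets past model_get_disk(), so model_release() can only see one if the pointer was overwritten after the lookup. That is why an extra NULL check in the release path would hide the symptom rather than locate the writer.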
Thanks, Al, for the reply and for looking into this.
We're using 3.4.71. This happened in production, and we cannot
reproduce it yet. What I could see is that before this happened, SCSI
devices went offline, multipath failed a path, and raid1 failed a member
device. Possibly the bug lies in one of the drivers: md-raid1, dm-multipath
or sd? How could I narrow it down? Could you teach me?
Thanks, and I wish you health and happiness!