Re: Reporting a bug - Memory corruption in Linux kernel

From: Theodore Ts'o
Date: Thu Mar 06 2014 - 23:00:41 EST


On Fri, Mar 07, 2014 at 01:39:45AM +0530, Nilesh More wrote:
> Hi all,
>
> I am working on android bug wherein directory entries of ext4 file
> system get corrupted when USB is hotplugged (with auto mount support
> enabled).
>
> The logs as below:
> [ 413.607849] usb 2-1.1: USB disconnect, device number 12
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Hot plugged or hot unplugged? It looks like the problem is that the
block device disappeared out from under ext4. Maybe you have a flaky
SD/MMC drive (i.e., funky contacts, etc.)? Or maybe when you plug in
one USB device, the eMMC device where you have the mounted file system
disappeared?

> If I prevent kill_bdev from invalidating pages, I see a No-Repro for
> this bug. Also there are no prints saying invalid access to FAT
> entry(which were present when bug reproduces). Earlier we had no-repro
> when added delay(1) before _getblk.
>
> This points out to the loss of sync between _getblk and kill_bdev and
> ALSO looks like kill_bdev inadvertently invalidates pages which are
> Ext4 owned.

This looks like it's much more of a hardware issue than a software
issue. If you are plugging in a USB device, you should *not* be
getting a USB disconnect message. And the fact that the pages being
used by ext4 are getting invalidated would be consistent with the
theory that the USB device holding the ext4 file system is
somehow getting disconnected, per the message you've shown in the
logs.
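
One rough way to check this theory is to pull every USB disconnect
event out of the kernel log and compare the timestamps against the
first ext4 errors. The sketch below is only illustrative: the regular
expression is an assumption based on the single log line quoted above,
and the function name find_usb_disconnects is hypothetical.

```python
import re

# Assumed format, taken from the one log line quoted in this thread:
#   [ 413.607849] usb 2-1.1: USB disconnect, device number 12
# Other kernels or log configurations may print this differently.
DISCONNECT_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+usb (?P<port>[\d.-]+): "
    r"USB disconnect, device number (?P<num>\d+)"
)


def find_usb_disconnects(log_text):
    """Return (timestamp, port, device_number) for each disconnect line."""
    events = []
    for line in log_text.splitlines():
        match = DISCONNECT_RE.search(line)
        if match:
            events.append(
                (float(match.group("ts")), match.group("port"), int(match.group("num")))
            )
    return events


if __name__ == "__main__":
    sample = "[ 413.607849] usb 2-1.1: USB disconnect, device number 12"
    for ts, port, num in find_usb_disconnects(sample):
        print(f"t={ts}s port={port} device={num}")
```

If a disconnect on the port that carries the ext4 device shows up
just before the corruption messages, that would be consistent with the
device going away underneath the file system.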

- Ted