Re: Apparent serious progressive ext4 data corruption bug in 3.6.3 (and other stable branches?)

From: Theodore Ts'o
Date: Fri Oct 26 2012 - 17:10:32 EST


This looks very different. The symptoms are quite different, and the
bug we are tracking most likely involves an unclean shutdown. In your
case, you're doing clean shutdowns, with some suspend/resume cycles
thrown in. Also, kernel version 3.5.5 doesn't have the commits that
were added between 3.6.1 and 3.6.3.

Are you running e2fsck to fix the file system consistency problems?
If so, what is e2fsck reporting?

Do you need to have a suspend/resume in order to trigger the problem?

This could very well be some kind of hardware problem or kernel bug
related to suspend/resume. Unfortunately, many different problems get
noticed by the file system, but the root cause can often be something
else: a hardware problem, or a bug somewhere else in the kernel.

Regards,

- Ted

P.S. Can you do us a favor and start a separate mail thread with the
information reposted? It can get hard to track different cases when
a lot of people assume that their random failures (some of which are
hardware problems) are related to the issue we are trying to track
down in this mail thread, and then they all pile onto the same mail
thread or the same web forum. That is one of the reasons why I detest
Ubuntu Launchpad. Thanks!!