Re: 2.6.29 regression: ATA bus errors on resume

From: Niel Lambrechts
Date: Thu Jun 25 2009 - 11:27:51 EST


On 06/25/2009 02:57 PM, Tejun Heo wrote:
> Sorry about the long delay.
>
> The result is perfectly good and yeah dump_stack() on the issue path
> would help but the problem is that block IO requests are processed
> asynchronously so by the time we find out which request fail, the
> requester stack is long gone. We can either record the stack trace
> with each request or trace it back one step at a time by chasing down
> the completion callbacks. The first requires more coding, so... :-)
>
> Looks like the request gotta be coming from __breadahead(). The only
> place this is used in ext4 is in __ext4_get_inode_loc(). Ah.. it also
> contains the matching error message. I still don't see how the READA
> buffer reads can affect the synchronous path. They're doing proper
> exclusion via buffer lock. Maybe they're getting merged? Yeap, looks
> like block code is merging READAs and regular READs.
>
> Can you please try the attached patch and reproduce the problem and
> report the kernel log? Hopefully, this will be the last debug run.

Hi Tejun,

I recently switched my root partition from OpenSUSE 11.1 to Fedora 11, and since then I have not seen the issue again. I'm still using vanilla 2.6.30 built with the same .config and EXT4 as before, so I have no idea why I cannot reproduce it. I still use hibernate + sleep frequently, and I just checked: I have 5 days of uptime with a mount count of 20, and the file system is still clean.

The one big difference is that my original partition was an EXT2 -> EXT3 -> EXT4 upgrade job over a long period of time, and some of the EXT4 parameters now used by Fedora 11 on the reformatted root partition differ from what I had then. Here is a summary of the differences in case it matters at all:

Current settings:
Default mount options: user_xattr acl
Inodes per group: 8192
Inode blocks per group: 512
Flex block group size: 16
Required extra isize: 28
Desired extra isize: 28
Default directory hash: half_md4

Previous settings:
Default mount options: (none)
Inodes per group: 8176
Inode blocks per group: 511
Default directory hash: tea
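
For anyone who wants to repeat this comparison on their own file systems, here is a rough sketch that diffs two "Key: value" listings like the ones above. It assumes the text is in the format printed by tune2fs -l; the function names parse_settings and diff_settings are made up for illustration. A key missing from one listing only means it was not in that summary, not necessarily that the feature was unset.

```python
def parse_settings(text):
    """Parse 'Key: value' lines into a dict, skipping lines without a colon."""
    settings = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def diff_settings(old, new):
    """Return {key: (old_value, new_value)} for every key whose value differs.

    A key present on only one side is reported with None for the other side.
    """
    changes = {}
    for key in sorted(set(old) | set(new)):
        before, after = old.get(key), new.get(key)
        if before != after:
            changes[key] = (before, after)
    return changes


# The two listings from this mail, copied verbatim.
PREVIOUS = """\
Default mount options: (none)
Inodes per group: 8176
Inode blocks per group: 511
Default directory hash: tea
"""

CURRENT = """\
Default mount options: user_xattr acl
Inodes per group: 8192
Inode blocks per group: 512
Flex block group size: 16
Required extra isize: 28
Desired extra isize: 28
Default directory hash: half_md4
"""

changes = diff_settings(parse_settings(PREVIOUS), parse_settings(CURRENT))
for key, (before, after) in changes.items():
    print(f"{key}: {before} -> {after}")
```

In practice the two strings would come from saved tune2fs -l output for the old and new partitions rather than being pasted inline.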

If I do notice any such errors again I'll apply the debug patch and let you know, but it does seem as if the upgrade made this issue disappear...

Regards,
Niel