Re: ext3_orphan_del may double-decrement bh->b_count

From: Chris Mason
Date: Thu Jun 03 2004 - 13:22:41 EST


On Wed, 2004-06-02 at 22:43, Andrew Morton wrote:
> Chris Mason <mason@xxxxxxxx> wrote:
> >
> > > You need the buffer-tracing patch. This is against 2.6.7-rc2. It should
> > > spit a nice trace when you hit the problem. It'll tell us how that buffer
> > > got itself not uptodate.
> >
> > Thanks. jeffm had worked out something similar that stored the EIP of
> > each bit operation, the uptodate bit seems to have turned off all on its
> > own. Once we can reproduce reliably on local boxes, we'll start
> > layering on the debugging code.
>
> buffer-trace code is what you need - it records the bh's internal state in
> its trace buffer too, replays it all when you hit an assertion failure.
>
I think the buffers are just victims of slab corruption. With slab
poisoning on I managed to get proof that 128 byte slabs are getting
stomped on. I'm still searching for a reliable trigger though, tests
after that haven't failed at all.

> > No triggers yet, I might have to grab a bigger machine in the morning.
>
> Is direct-io involved? I just discovered that clean_blockdev_aliases() is
> invalidating too many blocks. It tends to munch those indirect blocks.
> (What does bonnie++'s -f option do?)
>
skips the per char test. There's no O_DIRECT. Thanks for the patch
though ;-)

-chris


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/