Re: [BUG]: Ext2 Corruption in test10pre3 (incl. Oops)

From: Alexander Viro (viro@math.psu.edu)
Date: Tue Oct 17 2000 - 13:38:00 EST


On Tue, 17 Oct 2000, Linus Torvalds wrote:

> and the above is a perfectly fine backtrace, makes tons of sense, looks
> good.

Except the strange beast between ext2_create() and ext2_new_inode().

> HOWEVER. What doesn't make any sense at all is that bread() calls getblk()
> to find the buffer, which in turn certainly makes sure that the buffer it
> tries to read is mapped. In fact, there are two paths to the read: one
> finds the buffer off the hash queue, and the other creates it. The one
> that creates the buffer explicitly marks it BH_Mapped, so the only
> apparent source of problems would be the hash queue.
>
> Except for the fact that the only thing that adds buffers to the hash
> queue is __insert_into_queues(), and the only thing that calls THAT is
> getblk() itself - again after having marked the buffer mapped.
>
> In short, the debug trace looks fine, but it also looks completely
> incomprehensible. The only thing that would strike me is
> - memory corruption
> - somebody calls "unmap_buffer()" in a buffer that is hashed. Which we
> used to have as a bug, but we definitely don't do that any more.
> - we have buffer head list corruption going on.

 - we got a page-bound bh on free_list and called block_flushpage() on
that page. But yes, it definitely counts as buffer head list corruption.
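
To make that last scenario concrete, here is a toy userspace model in C. It
is only a sketch and not kernel code: every toy_* name is hypothetical, the
structures are drastically simplified, and the only things taken from the
discussion above are the rule that getblk() marks a buffer mapped before
hashing it and the idea that flushing a page unmaps the buffers bound to it.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR_HASH 8
#define TOY_BH_PER_PAGE 2

/* Toy stand-in for a buffer head; fields only loosely echo the kernel's. */
struct toy_bh {
    unsigned blocknr;
    bool mapped;   /* models BH_Mapped */
};

/* Toy stand-in for a page and the buffers still bound to it. */
struct toy_page {
    struct toy_bh *buffers[TOY_BH_PER_PAGE];
};

/* One slot per bucket is enough for this illustration. */
static struct toy_bh *toy_hash[TOY_NR_HASH];

void toy_reset(void)
{
    for (size_t i = 0; i < TOY_NR_HASH; i++)
        toy_hash[i] = NULL;
}

/* Models getblk(): return a hashed buffer, or take one from the "free
 * list" (here: the caller-supplied free_bh), mark it mapped, then hash it. */
struct toy_bh *toy_getblk(unsigned blocknr, struct toy_bh *free_bh)
{
    struct toy_bh *bh = toy_hash[blocknr % TOY_NR_HASH];

    if (bh && bh->blocknr == blocknr)
        return bh;
    free_bh->blocknr = blocknr;
    free_bh->mapped = true;              /* mapped first ... */
    toy_hash[blocknr % TOY_NR_HASH] = free_bh;  /* ... hashed second */
    return free_bh;
}

/* Models flushing a page: every buffer bound to the page gets unmapped. */
void toy_flushpage(struct toy_page *page)
{
    for (size_t i = 0; i < TOY_BH_PER_PAGE; i++)
        if (page->buffers[i])
            page->buffers[i]->mapped = false;
}

/* The invariant argued for above: anything on the hash queue is mapped. */
bool toy_hash_invariant_holds(void)
{
    for (size_t i = 0; i < TOY_NR_HASH; i++)
        if (toy_hash[i] && !toy_hash[i]->mapped)
            return false;
    return true;
}

/* Runs the scenario; stale_page_binding says whether the "free" buffer
 * is still bound to a page when it is handed out. Returns whether the
 * invariant still holds after the page is flushed. */
bool toy_scenario(bool stale_page_binding)
{
    struct toy_page page = { { NULL, NULL } };
    struct toy_bh bh = { 0, false };
    bool ok;

    toy_reset();
    if (stale_page_binding)
        page.buffers[0] = &bh;   /* bh is "free" yet still page-bound */
    toy_getblk(42, &bh);
    toy_flushpage(&page);
    ok = toy_hash_invariant_holds();
    toy_reset();                 /* do not leave pointers to locals behind */
    return ok;
}
```

In this model toy_scenario(false) keeps the invariant, while
toy_scenario(true) leaves a hashed but unmapped buffer behind, which is the
state the backtrace seems to imply. It says nothing about how a page-bound
bh would reach the free list in the first place.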

> Now, I don't see any recent code that has touched anything like this,
> which obviously doesn't mean anything at all. It might be a very old bug
> that just hasn't reared its head before now.
>
> Al, do you see anything wrong?

        See another posting. More or less the same analysis. I don't see
where it came from and it smells funny - looks like a loss of ->b_count
_or_ an active page returned by alloc_page() (to grow_buffers()). I
wouldn't exclude the latter, BTW, but then I'm still not too familiar with
Rik's changes to the VM, so it's just a nod toward an area I don't grok
right now.

> Udo, any idea what you are doing differently than anybody else to see
> this thing? Any special usage patterns that seem to bring on the trouble?

BTW, sorry for a stupid question, but... was it the first oops? If it was
an aftermath of something else...
