Nick Piggin <nickpiggin@xxxxxxxxxxxx> wrote:
> On Tue, 2005-04-26 at 04:50 -0700, Andrew Morton wrote:
> > Nick Piggin <nickpiggin@xxxxxxxxxxxx> wrote:
> > >
> > > When running
> > > fsstress -v -d $DIR/tmp -n 1000 -p 1000 -l 2
> > > on an ext2 filesystem with 1024 byte block size, on SMP i386 with 4096 byte
> > > page size over loopback to an image file on a tmpfs filesystem, I would
> > > very quickly hit
> > > BUG_ON(!buffer_async_write(bh));
> > > in fs/buffer.c:end_buffer_async_write
> > >
> > > It seems that more than one request would be submitted for a given bh
> > > at a time. __block_write_full_page looks like the culprit - with the
> > > following patch things are very stable.
> >
> > What's the bug? I don't see it.
> >
>
> Ah, the bug is that end_buffer_async_write first does
> BUG_ON(!buffer_async_write(bh));
> then a bit later does
> clear_buffer_async_write(bh);
> That's where it was blowing up for me, because end_buffer_async_write
> was being run twice for that buffer.
>
> Or did you mean *how* is it being run twice? I didn't exactly find
> the stack traces involved, but I imagine that simply testing
> buffer_async_write catches other requests in flight - ie. we've
> lost track of exactly which ones we own.

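For illustration only, the failure mode described above can be modelled in user space. This is a minimal sketch, not kernel code: struct toy_bh, toy_submit() and toy_end_async_write() are hypothetical names, and the single bool stands in for the BH_Async_Write bit. It assumes, as the report suggests, that the completion handler runs twice for one submission.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Toy stand-in for struct buffer_head: only the one flag of interest. */
struct toy_bh {
    bool async_write;   /* models the BH_Async_Write bit */
};

/* Models marking the buffer for async write when the I/O is submitted. */
static void toy_submit(struct toy_bh *bh)
{
    assert(bh != NULL);
    bh->async_write = true;
}

/*
 * Models the two steps of end_buffer_async_write() quoted above.
 * Returns 0 on a normal completion, -1 where the kernel would BUG().
 */
static int toy_end_async_write(struct toy_bh *bh)
{
    assert(bh != NULL);
    if (!bh->async_write) {          /* BUG_ON(!buffer_async_write(bh)) */
        fprintf(stderr, "would BUG: completion with async_write clear\n");
        return -1;
    }
    bh->async_write = false;         /* clear_buffer_async_write(bh) */
    return 0;
}
```

Calling toy_submit() once and then toy_end_async_write() twice returns 0 the first time and -1 the second: the first completion clears the flag, so the second one finds it already clear. The sketch says nothing about why a second completion would arrive, which is the open question in this thread.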
How can such a thing come about? Both PageLocked() and PageWriteback() are
supposed to stop new writeback being started against the page.
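As a rough user-space sketch of that expectation (not kernel code): struct toy_page, toy_start_writeback() and toy_end_writeback() are hypothetical names, and the two bools stand in for the PG_locked and PG_writeback bits. The sketch assumes a writer that skips a busy page, whereas a real writer may instead sleep until the page is free.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct page: just the two state bits discussed here. */
struct toy_page {
    bool locked;      /* models PageLocked() */
    bool writeback;   /* models PageWriteback() */
};

/*
 * Try to start writeback on a page.
 * Returns true if I/O was started, false if the page was busy and skipped.
 */
static bool toy_start_writeback(struct toy_page *page)
{
    assert(page != NULL);
    if (page->locked || page->writeback)
        return false;            /* someone else owns the page right now */

    page->locked = true;         /* take the page lock */
    page->writeback = true;      /* mark the page under writeback */
    page->locked = false;        /* drop the lock once I/O is submitted */
    return true;
}

/* I/O completion: the page may be written again after this. */
static void toy_end_writeback(struct toy_page *page)
{
    assert(page != NULL);
    page->writeback = false;
}
```

Under this model a second toy_start_writeback() on the same page fails until toy_end_writeback() has run, so two writes of one page cannot be in flight together. If the real code paths all honoured such a check, the double completion reported above should not be reachable through the page-level flags alone.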

<looks>

Were you using nobh? I guess not. What's to stop the new
mpage_writepage() from trying to write a page which is already under
PageWriteback()?

I don't think we understand this bug yet.