Re: BUG in VFS or block layer

From: Alan Stern
Date: Wed Aug 06 2008 - 18:41:20 EST


On Wed, 6 Aug 2008, Andrew Morton wrote:

> What the VFS will do is
>
> - lock the page
>
> - put the page into a BIO and send it down to the block layer
>
> - later, wait for IO completion. It does this by running
> lock_page[_killable](), which waits for the page to come unlocked.
>
> The page comes unlocked via the device driver, usually within the
> IO completion interrupt.
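
[The sequence above corresponds roughly to the generic read path in
mm/filemap.c. A simplified sketch, with refcounting and most error
handling elided, so illustrative rather than exact:

```c
/* Illustrative sketch of the generic read path; not compilable
 * outside the kernel tree, details elided. */
page = page_cache_alloc_cold(mapping);          /* no cached page: allocate */
add_to_page_cache_lru(page, mapping, index, GFP_KERNEL); /* page is locked */

/* ->readpage() builds a BIO for the page and submits it to the
 * block layer; the page stays locked until I/O completion. */
error = mapping->a_ops->readpage(filp, page);

/* Wait for completion: lock_page_killable() sleeps until the
 * driver-side completion path unlocks the page. */
if (lock_page_killable(page))
        goto readpage_error;
if (!PageUptodate(page))        /* completion reported an I/O error */
        goto readpage_error;
unlock_page(page);
```
]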
>
>
> A common cause of userspace lockups during IO errors is that the driver
> layer screwed up and didn't run the completion callback.
>
> Now, according to the above trace, the above code sequence _did_ work
> OK. Or at least, it ran to completion. It was later, when we tried to
> truncate a file that we stumbled across a permanently-locked page.
>
> So it would appear that the VFS read() code successfully completed, but
> left locked pages behind it, which caused the truncate to hang.

...

> One possible problem is here:
>
> readpage:
>         /* Start the actual read. The read will unlock the page. */
>         error = mapping->a_ops->readpage(filp, page);
>
>         if (unlikely(error)) {
>                 if (error == AOP_TRUNCATED_PAGE) {
>                         page_cache_release(page);
>                         goto find_page;
>                 }
>                 goto readpage_error;
>         }
>
> the VFS layer assumes that if ->readpage() returned a synchronous error
> then the page was already unlocked within ->readpage(). Usually this
> means that the driver layer had to run the BIO completion callback to
> do that unlocking. It is possible that the USB code forgot to do this.
> This would explain what you're seeing.
>
> So... would you be able to verify that the USB layer is correctly
> calling bio->bi_end_io() for the offending requests?
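
[The contract described above -- that ->readpage() must guarantee the
page gets unlocked whether the read fails synchronously or in the
completion handler -- might be sketched like this. Hypothetical
filesystem; the helper names are invented for illustration:

```c
/* Hypothetical ->readpage(), illustrating the unlock contract only. */
static int example_readpage(struct file *file, struct page *page)
{
        struct bio *bio = example_build_bio(page);  /* invented helper */

        if (!bio) {
                /* Synchronous failure: we must unlock here, because
                 * no completion callback will ever run for this page. */
                unlock_page(page);
                return -ENOMEM;
        }

        bio->bi_end_io = example_end_io;  /* unlocks page on completion */
        submit_bio(READ, bio);
        return 0;   /* page still locked; completion will unlock it */
}
```
]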

The USB layer doesn't handle that; the SCSI layer takes care of it.
Possibly the I/O error confuses the code in and around
scsi_end_request(). I'll have to do some testing to find out.
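
[For reference, the completion side of that chain -- what must end up
running when the request finishes, successfully or not -- looks roughly
like this (a sketch modeled on mpage_end_io-style handlers, using the
bi_end_io signature of this kernel era; simplified):

```c
/* Sketch of a BIO completion handler. It must run even on I/O
 * error, or the page stays locked forever and read/truncate hang. */
static void example_end_io(struct bio *bio, int err)
{
        struct page *page = bio->bi_io_vec[0].bv_page;

        if (err)
                SetPageError(page);     /* error: mark it, still unlock */
        else
                SetPageUptodate(page);

        unlock_page(page);      /* the unlock the VFS is sleeping on */
        bio_put(bio);
}
```

If the SCSI error path fails to complete the request, this handler never
runs and the symptom is exactly a permanently-locked page.]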

Alan Stern
