Re: [PATCH 00/32] Adaptive readahead V14

From: Jens Axboe
Date: Tue May 30 2006 - 08:26:45 EST


On Tue, May 30 2006, Wu Fengguang wrote:
> On Tue, May 30, 2006 at 11:23:10AM +0200, Jens Axboe wrote:
> > On Mon, May 29 2006, Wu Fengguang wrote:
> > > On Sun, May 28, 2006 at 11:23:33PM +0400, Michael Tokarev wrote:
> > > > Wu Fengguang wrote:
> > > > >
> > > > > It's not quite reasonable for readahead to worry about media errors.
> > > > > If the media fails, fix it. Or it will hurt read sooner or later.
> > > >
> > > > Well... In reality, it is just the opposite.
> > > >
> > > > Suppose there's a CD-rom with a scratch/etc, one sector is unreadable.
> > > > In order to "fix" it, one have to read it and write to another CD-rom,
> > > > or something.. or just ignore the error (if it's just a skip in a video
> > > > stream). Let's assume the unreadable block is number U.
> > > >
> > > > But current behavior is just insane. An application requests block
> > > > number N, which is before U. Kernel tries to read-ahead blocks N..U.
> > > > Cdrom drive tries to read it, re-read it.. for some time. Finally,
> > > > when all the N..U-1 blocks are read, kernel returns block number N
> > > > (as requested) to an application, successefully.
> > > >
> > > > Now an app requests block number N+1, and kernel tries to read
> > > > blocks N+1..U+1. Retrying again as in previous step.
> > > >
> > > > And so on, up to when an app requests block number U-1. And when,
> > > > finally, it requests block U, it receives read error.
> > > >
> > > > So, kernel currentry tries to re-read the same failing block as
> > > > many times as the current readahead value (256 (times?) by default).
> > >
> > > Good insight... But I'm not sure about it.
> > >
> > > Jens, will a bad sector cause the _whole_ request to fail?
> > > Or only the page that contains the bad sector?
> >
> > Depends entirely on the driver, and that point we've typically lost the
> > fact that this is a read-ahead request and could just be tossed. In
> > fact, the entire request may consist of read-ahead as well as normal
> > read entries.
> >
> > For ide-cd, it tends do only end the first part of the request on a
> > medium error. So you may see a lot of repeats :/
>
> Another question about it:
> If the block layer issued a request, which happened to contain
> R ranges of B bad blocks, i.e. 3 ranges of 9 bad-blocks:
> ___b_____bb___________bbbbbb____
> How many retries will incur? 1, 3, 9, or something else?
> If it is 3 or more, then we are even more bad luck :(

Again, this is driver specific. But for ide-cd, if it's using PIO the
right thing should happen since we do each chunk individually. For dma
it looks much worse, since we only get an EIO back from the hardware for
the entire range. It wont do the right thing at all, only for the very
last thing when get get past the last bbbbbb block.

> Will it be suitable to _automatically_ apply the following retracting
> policy on I/O error? Please comment if there's better ways:

Probably it should be even more aggressively scaling down. The real
problem is the drivers of course, we should spend some time fixing them
up too.

--
Jens Axboe

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/