Re: 2.5.59-mm5

From: Andrew Morton (akpm@digeo.com)
Date: Fri Jan 24 2003 - 06:16:32 EST


Alex Bligh - linux-kernel <linux-kernel@alex.org.uk> wrote:
>
> --On 23 January 2003 19:50 -0800 Andrew Morton <akpm@digeo.com> wrote:
>
> > So what anticipatory scheduling does is very simple: if an application
> > has performed a read, do *nothing at all* for a few milliseconds. Just
> > return to userspace (or to the filesystem) in the expectation that the
> > application or filesystem will quickly submit another read which is
> > close by.
>
> I'm sure this is a really dumb question, as I've never played
> with this subsystem, in which case I apologize in advance.
>
> Why not follow (by default) the old system where you put the reads
> effectively at the back of the queue. Then rather than doing nothing
> for a few milliseconds, you carry on with doing the writes. However,
> promote the reads to the front of the queue when you have a "good
> lump" of them.

That is the problem. Reads do not come in "lumps". They are dependent.
Consider the case of reading a file:

1: Read the directory.

   This is a single read, and we cannot do anything until it has
   completed.

2: The directory told us where the inode is. Go read the inode.

   This is a single read, and we cannot do anything until it has
   completed.

3: The inode told us where the data is. Go read the first 12 blocks of
   the file and the first indirect block.

   This is a single read, and we cannot do anything until it has
   completed.

The above process can take up to three trips through the request queue.
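
To make the dependency concrete, here is a rough toy model of the three steps
above. The "disk" is just a dict, and every block number, the file name and
the layout are invented for illustration; this is not how any real filesystem
code looks. The point is only that the target of each read is unknown until
the previous read has completed.

```python
# Toy "disk": block number -> contents. All numbers here are made up.
DISK = {
    2: {"type": "dir", "entries": {"foo.txt": 17}},
    17: {"type": "inode", "direct": [100, 101, 102], "indirect": 250},
    100: b"a",
    101: b"b",
    102: b"c",
    250: {"type": "indirect", "blocks": []},
}


def read_file(disk, dir_block, name):
    """Return (data, trips): file contents and the number of queue trips."""
    trips = 0

    # Trip 1: the directory. Nothing else is known yet.
    directory = disk[dir_block]
    trips += 1

    # Trip 2: the inode. Its location came out of the directory read.
    inode = disk[directory["entries"][name]]
    trips += 1

    # Trip 3: the direct blocks and the first indirect block. Their
    # locations came out of the inode read.
    wanted = inode["direct"] + [inode["indirect"]]
    blocks = [disk[b] for b in wanted]
    trips += 1

    data = b"".join(b for b in blocks if isinstance(b, bytes))
    return data, trips
```

None of the three lookups can be reordered or issued early, because each
one's argument is the result of the one before it.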

In this very common scenario, the only way we'll ever get "lumps" of reads is
if some other processes come in and happen to want to read nearby sectors.
In the best case, the size of the lump is proportional to the number of
processes which are concurrently trying to read something. This just doesn't
happen enough to be significant or interesting.
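
A crude way to see the bound, assuming every reader behaves like the file
read above (submit one read, block, submit the next). The function name and
the round-based "disk" are hypothetical simplifications:

```python
def max_read_lump(num_readers, reads_per_reader=3):
    """Largest number of reads ever queued at once in this toy model.

    Each reader submits one read, blocks until it completes, and only
    then submits its next (dependent) read.
    """
    remaining = [reads_per_reader] * num_readers
    largest = 0
    while any(remaining):
        # Every reader that still has work has exactly one read queued.
        queue = [pid for pid, left in enumerate(remaining) if left > 0]
        largest = max(largest, len(queue))
        # The disk services the queue; each reader may now issue its next read.
        for pid in queue:
            remaining[pid] -= 1
    return largest
```

With one reader the lump is never bigger than one request, and with N readers
it is never bigger than N. Even that assumes all N happen to be reading
nearby sectors, which is the optimistic case.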

But writes are completely different. There is no dependency between them, and
at any point in time we know where on disk a lot of writes will be placed.
We don't know that for reads, which is why we need to twiddle thumbs until the
application or filesystem makes up its mind.
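
As a rough illustration of why knowing the write locations up front matters,
the sketch below compares head travel for a batch of writes serviced in
arrival order against the same batch sorted by sector. The sector numbers are
invented and head travel is modelled as a plain absolute difference, which
ignores rotation and everything else a real disk does.

```python
def head_travel(start, sectors):
    """Total distance the head moves visiting sectors in the given order."""
    travel, pos = 0, start
    for sector in sectors:
        travel += abs(sector - pos)
        pos = sector
    return travel


# Hypothetical batch of queued writes. All locations are known right now.
pending_writes = [900, 10, 500, 20, 910]

arrival_order = head_travel(0, pending_writes)
sorted_order = head_travel(0, sorted(pending_writes))
```

Here the sorted pass moves the head 910 sectors against 3650 in arrival
order. The same sort cannot be applied to a dependent read chain, because the
later entries do not exist until the earlier reads complete.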
