Re: elevator messages in 2.3.50

From: Andrea Arcangeli (andrea@suse.de)
Date: Wed Mar 08 2000 - 21:18:15 EST


On Thu, 9 Mar 2000, Jamie Lokier wrote:

>Andrea Arcangeli wrote:
>> I am now in the process of partially rewriting the request merging code so
>> that I don't seek back and forth if not necessary. I am trying to exploit
>> all possible optimizations.
>
>You mean ordering to minimise seek time?

Sorry, I was not clear. I meant seeking in the in-core queue, all in
memory. I am not changing the result of the algorithm (the disk is driven
in the same way). I am cleaning up the implementation. See the function
seek_to_not_starving_chunk(). That function is gone in my tree. In it I
was going backwards (seeking to the starving entry) and then I was always
doing the real work forwards. Now I do all the real work backwards, the
function is gone, and I browse the list at least one less time (two fewer
times if there wasn't a request available and we had to drop the
spinlock).
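
A minimal sketch of the single backward pass described above, assuming a
much simplified queue. The names struct req, struct req_queue, queue_add()
and the "starving" flag are hypothetical and are not taken from the kernel
tree; the real elevator also handles merging and latency accounting, which
are left out here.

```c
#include <stddef.h>

/* Hypothetical, simplified request; the real struct request has more fields. */
struct req {
    unsigned long sector;    /* start sector of the request */
    int starving;            /* assumed flag: nothing may be queued ahead of this entry */
    struct req *prev, *next;
};

/* Circular doubly linked list with a sentinel head. */
struct req_queue {
    struct req head;
};

static void queue_init(struct req_queue *q)
{
    q->head.prev = q->head.next = &q->head;
}

/*
 * Single backward pass: start at the tail and walk toward the head until we
 * reach the sentinel, a starving entry, or an entry that sorts at or before
 * the new request. The new request is linked in right after that point, so
 * no separate "seek back, then work forward" step is needed.
 */
static void queue_add(struct req_queue *q, struct req *rq)
{
    struct req *pos = q->head.prev;

    while (pos != &q->head && !pos->starving && pos->sector > rq->sector)
        pos = pos->prev;

    rq->prev = pos;
    rq->next = pos->next;
    pos->next->prev = rq;
    pos->next = rq;
}
```

With requests for sectors 10, 30 and 20 added in that order, the queue ends
up sorted as 10, 20, 30. If the entry for sector 30 is then marked starving,
a new request for sector 5 is placed after it rather than at the front.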

>A possible improvement would be a small holdoff time between one request
>and the next, if a large seek would be involved. The idea here is to
>support applications which do a sequence of reads in a local region,
>where each read is not queued until the previous one completes. The
>hottest example is page-in but there are others.

This is usually handled by readahead (we also do readahead during paging).
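
A rough sketch of why readahead covers this case, assuming a fixed window:
a read of one block queues the following blocks in the same go, so the
next sequential read does not have to wait for the first one to complete.
RA_WINDOW and blocks_to_queue() are hypothetical names for illustration
only; the kernel's readahead sizes its window dynamically.

```c
#include <stddef.h>

#define RA_WINDOW 4UL   /* assumed readahead window, in blocks */

/*
 * Fill 'out' with the blocks to queue for a read of 'block': the block
 * itself plus up to RA_WINDOW following blocks, clipped to the end of the
 * file ('nr_blocks') and to the size of 'out'.
 * Returns the number of blocks written to 'out'.
 */
static size_t blocks_to_queue(unsigned long block, unsigned long nr_blocks,
                              unsigned long *out, size_t out_len)
{
    size_t n = 0;
    unsigned long b;

    for (b = block; b < nr_blocks && n < out_len && n < RA_WINDOW + 1; b++)
        out[n++] = b;

    return n;
}
```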

>The holdoff time would be just enough to permit an application to queue
>the next request, if it is going to do that immediately. E.g. I have
>process A doing lots of I/O. Details not important.
>
>I also have process B. It reads something, sleeps, wakes up and quickly
>issues a read for the next thing. That might be via paging or explicit
>reads.
>
>To minimise overall seek time, it is probably better to _not_ schedule
>an I/O from process A in the short time between process B's two
>requests, _if_ A's request would imply a large seek. However, the only

Definitely. And A's request almost always implies a seek. By using
readahead, B should make the two requests at the same time.

>way to do that is to let the device idle for a short time. Sure it's
>idle, but overall seek time is reduced.
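
A minimal sketch of the hold-off decision proposed in the quoted text,
assuming the elevator knows where the last request finished and how long
the device has been idle. The function name and both thresholds are
made-up illustration values, not measurements and not kernel code.

```c
#include <stdbool.h>

#define LARGE_SEEK_SECTORS 100000UL  /* assumed: distance treated as a "large" seek */
#define HOLDOFF_USEC       500UL     /* assumed: longest time to leave the device idle */

static unsigned long seek_distance(unsigned long a, unsigned long b)
{
    return a > b ? a - b : b - a;
}

/*
 * Decide whether to keep the device idle instead of dispatching the request
 * at 'next_sector'. 'head_sector' is where the last request finished and
 * 'idle_usec' is how long the device has already been idle.
 */
static bool should_hold_off(unsigned long head_sector,
                            unsigned long next_sector,
                            unsigned long idle_usec)
{
    if (idle_usec >= HOLDOFF_USEC)
        return false;   /* waited long enough, dispatch anyway */

    return seek_distance(head_sector, next_sector) >= LARGE_SEEK_SECTORS;
}
```

The sketch only holds off while the idle time is below the cap, so a
process that never issues its follow-up read costs at most HOLDOFF_USEC of
idle device time per decision.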

Hans told me about something like that last month. That can make sense for
the indirect blocks where we can't do readahead on all the lower level
indirect blocks but we know that if the fs did a good job the indirect
blocks will be near each other.

But the first problem is that it looks too dependent on timings (we need
to know the timing of a sync read). Then it's possible to do that only
with fs support that gives a hint to the elevator, otherwise you have
_no_way_ to know there will be a request any time soon (and waiting
without knowing there will be a nearby read very soon is not an option).
Then with an lvm/raid array underneath you may do a very wrong thing by
delaying the very far seek, because there could be no hardware seek but
only a disk change. Lastly, rescheduling may cause delays due to high cpu
load etc...

So basically I believe it's a kernel-bloat-hack not worth implementing.

And implementing extents in the filesystem is going to take care of the
indirect blocks issue anyway.

Andrea
