Re: More on 2.2.18pre2aa2 (summary of elevator ideas)

From: Ed Tomlinson (tomlins@cam.org)
Date: Tue Sep 12 2000 - 17:08:41 EST

Next message: Andi Kleen: "Re: NFS locking bug -- limited mtime resolution means nfs_lock() does not provide coherency guarantee"
Previous message: Andries Brouwer: "Re: Test 8 Kernel Unable to get the password prompt?"
Next in thread: Jeff V. Merkey: "Re: More on 2.2.18pre2aa2 (summary of elevator ideas)"
Reply: Jeff V. Merkey: "Re: More on 2.2.18pre2aa2 (summary of elevator ideas)"
Reply: Andrew Scott: "Re: (reiserfs) Re: More on 2.2.18pre2aa2 (summary of elevator ideas)"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

Hi,

Geez, A simple comment on IRC can _really_ generate lots of feedback.
(There were over 50 messages about this in my queue - did not help
that some were duplicated three times <grin>).

I made the comment because I remember back when the discussion was current
on linux kernel. I thought Jeff Merkey's, message was to the point. Para-
phrasing from memory, it was something to the effect that novell had
tried many elevators. All had problems with some loads. The best they
had found was the 'close the door' idea. I do not remember if the door
was based on requests or time. Another point to remember is that the
netware people came up with a what they considered a good solution. From
Jeff's comment they arrived at this solution by experiments and bitter
experience. Maybe we can learn something from their research?

Here is what I glean from the thread:

>From all the discussion I find this suggestion from Alan to make lots of
sense. Think it can be made to work with number of request almost as easily
as with time...

>When you do the scan to decide where to insert the entry you dont consider
>insertion before the time. Also you keep two queue heads the real and the
>insert head. Whenever you walk from the insert head and find it points to
>a 'too old' entry you update the insert_head.

And this suggestion from Rik should counter most of Andrea's time vs requests
vs slow block devices issues. We just have to be sure to close the door after
atleast n request or m time. However, as pointed out by Chris Evan, later we
may not have to do this - there are stats that can give a good idea of a
device's latency.

>Not really. What about just using something like
>"half a second, but at least 10 requests liberty to
>reorder" ?

>It's simple, should be good enough for the user and
>allows for a little bit of reordering even on very
>slow devices ...

As Andrea points out its easy enought to do some sort of test with
the current code.

>Well changing that is very easy, you only have to change the unit of
>measure w/o changing one bit in the algorithm that is just implemented
>indeed.

>How

>Just assume the req->elevator_sequence to be calculate in jiffy and in
>elevator.c change the check for `!req->elevator_sequence' to
>`time_before(req->elevator_sequence, jiffies)'. Then of course change the
>initialization of req->elevator_sequence to be done with `jiffies +
>elevator->read_latency'. Then also elvtune will talk in jiffies and not in
>requests.

I wonder if using a wandering insert pointer, as Alan suggests, would give
lower overhead than the current implementation (and would it really help)?

Again from Alan,

Andrea> Now, I see people trying to introduce the concept of elapsed time
Andrea> that fix, which smells strongly of hack. How will this hack be cobbled
>
> Actually my brain says that elapsed time based scheduling is the right
> thing to do. It certainly works for networks

And from Chris Evans,

>Interesting, I'll try and run with this. The mention of networks reminds
>me that any "max service time" variable is a tunable quantity depending on
>current conditions..

>.. and sct's block device I/O accounting patches give us the current
>average request service time on a per-device basis. Multiply that up a bit
>and maybe you have your threshold for moving things to the head of the
>queue.

So we could end up using a figure that builds in both number of request and
time on a per device level without to much effort...

And Alan sums up the whole thing nicly with:

Andrea> Why do you say it's not been fixed? Can you still reproduce hangs long
Andrea> a write(2) can write? I certainly can't.

>I cant reproduce long hangs. Im not seeing as good I/O throughput as before
>but right now Im quite happy with the tradeoff. If someone can make it better
>then Im happier still

If the idea works, lead to simpler code, a more reponsive system maybe with
better benchmarks then its a winner. Only way we can be sure is to try it.

Thanks,

Ed Tomlinson <tomlins@cam.org> (ontadata on IRC)

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/

Next message: Andi Kleen: "Re: NFS locking bug -- limited mtime resolution means nfs_lock() does not provide coherency guarantee"
Previous message: Andries Brouwer: "Re: Test 8 Kernel Unable to get the password prompt?"
Next in thread: Jeff V. Merkey: "Re: More on 2.2.18pre2aa2 (summary of elevator ideas)"
Reply: Jeff V. Merkey: "Re: More on 2.2.18pre2aa2 (summary of elevator ideas)"
Reply: Andrew Scott: "Re: (reiserfs) Re: More on 2.2.18pre2aa2 (summary of elevator ideas)"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

This archive was generated by hypermail 2b29 : Fri Sep 15 2000 - 21:00:19 EST