Re: stochastic fair queueing in the elevator [Re: [BENCHMARK] 2.4.20-ck3 / aa / rmap with contest]

From: Nick Piggin (piggin@cyberone.com.au)
Date: Sun Feb 09 2003 - 23:44:35 EST


Rik van Riel wrote:

>On Mon, 10 Feb 2003, Nick Piggin wrote:
>
>>Andrea Arcangeli wrote:
>>
>>
>>>The only way to get the minimal possible latency and maximal fariness is
>>>my new stochastic fair queueing idea.
>>>
>>Sounds nice. I would like to see per process disk distribution.
>>
>
>Sounds like the easiest way to get that fair, indeed. Manage
>every disk as a separately scheduled resource...
>
I hope this option becomes available one day.

>
>
>>However dependant reads can not merge with each other obviously so
>>you could end up say submitting 4K reads per process.
>>
>
>Considering that one medium/far disk seek counts for about 400 kB
>of data read/write, I suspect we'll just want to merge requests or
>put adjacant requests next to each other into the elevator up to
>a fairly large size. Probably about 1 MB for a hard disk or a cdrom,
>but much less for floppies, opticals, etc...
>
Yes, but the point is _dependant reads_. This is why Andrea's approach
alone can't solve dependant read vs write or nondependant read - while
maintaining a good throughput.

>
>
>>But your solution also does nothing for sequential IO throughput in
>>the presence of more than one submitter.
>>
>
>>I think you should be giving each process a timeslice,
>>
>
>That is the anticipatory scheduler. A good complement to the SFQ
>part of the IO scheduler. I'd really like to see both ideas together
>in one scheduler, they sound like a winning pair.
>
No, anticipatory scheduling just "pauses for a bit after some reads"
I am talking about all this nonsense about trying to measure seek vs
throughput vs rotational latency, etc. We don't process schedule by
how many memory accesses a process makes * processor speed / memory
speed + instruction jumps.....

If you want a fair disk scheduler, just give each process a disk
timeslice, and all your seek, stream, settle, track buffer, etc
accounting is magically done for you... For any device, and is
100% tunable. The point is, account by _time_.

Nick

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Sat Feb 15 2003 - 22:00:25 EST