Re: Reduce latencies for synchronous writes and high I/O priority requests in deadline IO scheduler

From: Corrado Zoccolo
Date: Thu Apr 23 2009 - 12:10:56 EST


On Thu, Apr 23, 2009 at 1:52 PM, Aaron Carroll <aaronc@xxxxxxxxxxxxxxx> wrote:
> Corrado Zoccolo wrote:
>> Hi,
>> The deadline I/O scheduler currently classifies all I/O requests into only 2
>> classes, reads (always considered high priority) and writes (always
>> lower).
>> The attached patch, intended to reduce latencies for synchronous writes
>
> Can be achieved by switching to sync/async rather than read/write. No
> one has shown results where this makes an improvement. Let us know if
> you have a good example.

Yes, this is exactly what my patch does, and the numbers for
fsync-tester are much better than baseline deadline, almost comparable
with cfq.

>
>> and high I/O priority requests, introduces more levels of priorities:
>> * real time reads: highest priority and shortest deadline, can starve
>> other levels
>> * synchronous operations (either best effort reads or RT/BE writes),
>> mid priority, starvation for lower level is prevented as usual
>> * asynchronous operations (async writes and all IDLE class requests),
>> lowest priority and longest deadline
>>
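To make the three levels listed above concrete, here is a minimal sketch
of the classification. It is illustrative Python, not the kernel code
from the patch: the names Level and classify, and the string values used
for the I/O priority class, are made up for this example. It also
assumes that every read counts as synchronous.

```python
from enum import IntEnum


class Level(IntEnum):
    """Hypothetical priority levels; a lower value is served first."""
    RT_READ = 0   # real time reads: highest priority, shortest deadline
    SYNC = 1      # best effort reads and RT/BE synchronous writes
    ASYNC = 2     # async writes and everything in the IDLE class


def classify(is_read, is_sync, ioprio_class):
    """Map a request to a level.

    ioprio_class is one of "RT", "BE" or "IDLE" (illustrative strings).
    Assumption: reads are always treated as synchronous.
    """
    if ioprio_class == "IDLE":
        return Level.ASYNC          # all IDLE class requests go to the lowest level
    if is_read and ioprio_class == "RT":
        return Level.RT_READ
    if is_read or is_sync:
        return Level.SYNC           # BE reads, RT/BE synchronous writes
    return Level.ASYNC              # asynchronous writes
```

With this mapping an fsync-driven write (is_read=False, is_sync=True) in
the BE class lands in the same level as a best effort read, instead of
waiting behind all reads as it does with the plain read/write split.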
>> The patch also introduces some new heuristics:
>> * for non-rotational devices, reads (within a given priority level)
>> are issued in FIFO order, to improve the latency perceived by readers
>
> This might be a good idea.
I think Jens doesn't like it very much.
> Can you make this a separate patch?
I have an earlier attempt, much simpler, at:
http://lkml.indiana.edu/hypermail/linux/kernel/0904.1/00667.html
> Is there a good reason not to do the same for writes?
Well, in that case you could just use noop.
I found that this scheme outperforms noop. Random writes, in fact,
perform quite badly on most SSDs (unless you use a logging FS like
nilfs2, which transforms them into sequential writes), so keeping all
the deadline I/O scheduler machinery that merges write requests works much
better. As I said, my patched I/O scheduler outperforms noop on my
normal usage.
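
As a rough illustration of why sorting and merging help with writes, the
sketch below coalesces requests that turn out to be contiguous once they
are ordered by start sector. It is a simplified Python model, not the
scheduler's actual merge code: the function name merge_contiguous and
the (start_sector, n_sectors) tuples are assumptions made for this
example, and limits such as a maximum request size are ignored.

```python
def merge_contiguous(requests):
    """Coalesce write requests that are adjacent on disk.

    requests: iterable of (start_sector, n_sectors) tuples, in any order.
    Returns a new list sorted by start sector, where each request that
    begins exactly where the previous one ends has been merged into it.
    """
    merged = []
    for start, length in sorted(requests):
        if merged and merged[-1][0] + merged[-1][1] == start:
            prev_start, prev_len = merged[-1]
            merged[-1] = (prev_start, prev_len + length)  # extend previous request
        else:
            merged.append((start, length))
    return merged
```

For example, three 8-sector writes submitted at sectors 8, 0 and 32 end
up as one 16-sector write at sector 0 plus one 8-sector write at sector
32, so the device sees two requests instead of three.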


>> * minimum batch timespan (time quantum): partners with fifo_batch to
>> improve throughput, by sending more consecutive requests together. A
>> given number of requests will not always take the same time (due to
>> the amount of seeking needed), therefore fifo_batch must be tuned for
>> worst cases, while in best cases, having longer batches would give a
>> throughput boost.
>> * batch start request is chosen fifo_batch/3 requests before the
>> expired one, to improve fairness for requests with lower start sector,
>> which otherwise have a higher probability of missing a deadline than
>> mid-sector requests.
>
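The two heuristics quoted above can be sketched as follows. This is
illustrative Python under stated assumptions, not code from the patch:
the helper names, the min_quantum_ms parameter, and the reading that a
batch continues until both the request count and the minimum timespan
have been reached are guesses made for this example.

```python
def batch_start_index(sorted_sectors, expired_sector, fifo_batch):
    """Pick where a batch starts in the sector-sorted queue.

    sorted_sectors: start sectors of pending requests, ascending.
    expired_sector: start sector of the request whose deadline expired.
    The batch starts fifo_batch/3 requests before the expired one,
    clamped to the beginning of the queue.
    """
    expired_idx = sorted_sectors.index(expired_sector)
    return max(0, expired_idx - fifo_batch // 3)


def batch_may_continue(dispatched, elapsed_ms, fifo_batch, min_quantum_ms):
    """Assumed rule: keep dispatching until the batch has sent at least
    fifo_batch requests AND has lasted at least min_quantum_ms."""
    return dispatched < fifo_batch or elapsed_ms < min_quantum_ms
```

With fifo_batch = 16 and eight pending requests, an expired request at
position 6 in sector order would start the batch at position 1, so the
requests with lower start sectors are served in the same sweep.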
> I don't like the rest of it. I use deadline because it's a simple,
> no surprises, no bullshit scheduler with reasonably good performance
> in all situations. Is there some reason why CFQ won't work for you?

I actually like CFQ, and use it almost everywhere, and switch to
deadline only when submitting a heavy-duty workload (having a SysRq
combination to switch I/O schedulers could sometimes be very handy).

However, CFQ is not optimal on SSDs, so I'm developing this patch to
overcome those limitations.
In the meantime, I also wanted to overcome deadline's limitations, i.e.
the high latencies on fsync/fdatasync.

Corrado

--
__________________________________________________________________________

dott. Corrado Zoccolo mailto:czoccolo@xxxxxxxxx
PhD - Department of Computer Science - University of Pisa, Italy
--------------------------------------------------------------------------