Re: Reduce latencies for synchronous writes and high I/O priority requests in deadline IO scheduler

From: Corrado Zoccolo
Date: Fri Apr 24 2009 - 12:08:12 EST


On Fri, Apr 24, 2009 at 8:39 AM, Jens Axboe <jens.axboe@xxxxxxxxxx> wrote:
> I find your solution quite confusing - the statement is that CFQ
> isn't optimal on SSD, so you modify deadline? ;-)

Well, I find CFQ too confusing to start with, so I chose a simpler one.
If I can prove something with deadline, maybe you will decide to
implement it in CFQ as well ;)

>
> Most of the "CFQ doesn't work well on SSD" statements are simply wrong.
> Now, you seem to have done some testing, so when you say that, you
> probably have actual results telling you that this is the case. But
> let's attempt to fix that issue, then!
>
> One thing you pointed out is that CFQ doesn't treat the device as a
> "real" SSD unless it does queueing. This is very much on purpose, for
> two reasons:
>
> 1) I have never seen a non-queueing SSD that actually performs well for
> reads-vs-write situations, so CFQ still does idling for those.

Does CFQ idle only when switching between reads and writes, or also
when switching between reads from one process and reads from
another?
I think I'll have to instrument CFQ a bit to understand how it works.
Is there a better way than scattering printks all around?

> 2) It's a problem that is going away. SSD that are coming out today and
> in the future WILL definitely do queuing. We can attribute most of
> the crap behaviour to the lacking jmicron flash controller, which
> also has a crappy SATA interface.

I think SD cards will still be around for a long time, and I don't
expect them to gain queuing, so some support for them might still be
needed.

> What I am worried about in the future is even faster SSD devices. CFQ is
> already down a percent or two when we are doing 100k iops and such, this
> problem will only get worse. So I'm very much interested in speeding up
> CFQ for such devices, which I think will mainly be slimming down the IO
> path and bypassing much of the (unneeded) complexity for them. The last
> thing I want is to have to tell people to use deadline or noop on SSD
> devices.
>

Totally agreed. Having the main IO scheduler perform well in most
scenarios is surely needed.
But this could be achieved in various ways.
What if the main IO scheduler had various strategies in its toolbox,
and could switch between them based on the workload or the type of
hardware?
FIFO scheduling for reads could be one such strategy, used only when
the conditions are good for it (see the sketch below).
Another possibility is to use auto-tuning strategies, but those are
more difficult to devise and test.
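
To make the idea concrete, here is a minimal sketch of what such
strategy switching might look like. This is not existing kernel code;
all the names (dispatch_strategy, pick_strategy) and the threshold
are made up for illustration:

	/*
	 * Hypothetical sketch: the scheduler picks a dispatch strategy
	 * from its toolbox based on what it knows about the device and
	 * the workload. None of these names exist in the kernel.
	 */
	enum dispatch_strategy {
		STRAT_ELEVATOR,   /* sector-sorted batches: pay off when seeks are costly */
		STRAT_READ_FIFO,  /* arrival-order reads: fine when seeks are free */
	};

	static enum dispatch_strategy pick_strategy(int nonrot,
						    unsigned int avg_seek_us)
	{
		/*
		 * Use the read FIFO only when conditions are good for
		 * it: non-rotational media, or a measured seek cost so
		 * small that sector ordering buys nothing (the 100us
		 * threshold is arbitrary).
		 */
		if (nonrot || avg_seek_us < 100)
			return STRAT_READ_FIFO;
		return STRAT_ELEVATOR;
	}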

>> In the meantime, I wanted to overcome also deadline limitations, i.e.
>> the high latencies on fsync/fdatasync.
>
> This is very much something you could pull out of the patchset and we
> could include without much questioning.
>

Ok, this is the first patch of the series; it contains the code
cleanup needed before changing read/write to sync/async. No
behavioral change is introduced by this patch.

I found where the random read performance is gained, but I didn't
include it in this patch, because it requires the sync/async
separation to avoid negatively impacting sync write latencies.

If the following new code, which replicates the existing behaviour:

	if (!dd->next_rq
	    || rq_data_dir(dd->next_rq) != data_dir
	    || deadline_check_fifo(dd, data_dir)) {
		/*
		 * A deadline has expired, the last request was in the
		 * other direction, or we have run out of
		 * higher-sectored requests.
		 */

is changed to:

	if (!dd->next_rq
	    || rq_data_dir(dd->next_rq) > data_dir
	    || deadline_check_fifo(dd, data_dir)) {
		/*
		 * A deadline has expired, the last request was less
		 * important (where WRITE is less important than READ),
		 * or we have run out of higher-sectored requests.
		 */

then you get both higher random read throughput and higher write
latencies.
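
For reference, READ is 0 and WRITE is 1 in the kernel, so the new
test falls back to the FIFO only when the cached next request is a
write and the chosen direction is read. My reading of the two
predicates, as a truth table:

	/*
	 * data_dir   rq_data_dir(next_rq)   "!=" restarts   ">" restarts
	 * READ       READ                   no              no
	 * READ       WRITE                  yes             yes
	 * WRITE      READ                   yes             no
	 * WRITE      WRITE                  no              no
	 *
	 * The WRITE/READ row is the interesting one: instead of jumping
	 * to the write FIFO, the scheduler keeps following the
	 * sector-ordered read stream, which is where the random reads
	 * go faster and the writes wait longer.
	 */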

Corrado

> --
> Jens Axboe
>
>

--
__________________________________________________________________________

dott. Corrado Zoccolo mailto:czoccolo@xxxxxxxxx
PhD - Department of Computer Science - University of Pisa, Italy
--------------------------------------------------------------------------

Attachment: deadline-patch-cleanup
Description: Binary data