Re: Performance regression in IO scheduler still there

From: Corrado Zoccolo
Date: Sun Nov 08 2009 - 12:01:58 EST


On Fri, Nov 6, 2009 at 7:56 PM, Jeff Moyer <jmoyer@xxxxxxxxxx> wrote:
> Jeff Moyer <jmoyer@xxxxxxxxxx> writes:
>            |     |       |        |        |        |      rlat       |      rrlat      |      wlat      |    rwlat
> kernel     | Thr | read  | randr  | write  | randw  |    avg, max     |    avg, max     |    avg, max    |   avg, max
> ------------------------------------------------------------------------------------------------------------------------
> 2.6.29     |  8  | 66.43 |  20.52 | 296.32 | 214.17 | 22.330, 3106.47 | 70.026, 2804.02 | 4.817, 2406.65 | 1.420, 349.44
>            | 16  | 63.28 |  20.45 | 322.65 | 212.77 | 46.457, 5779.14 |137.455, 4982.75 | 8.378, 5408.60 | 2.764, 425.79
> ------------------------------------------------------------------------------------------------------------------------
> 2.6.32-rc6 |  8  | 87.66 | 115.22 | 324.19 | 222.18 | 16.677, 3065.81 | 11.834,  194.18 | 4.261, 1212.86 | 1.577, 103.20
> low_lat=0  | 16  | 94.06 |  49.65 | 327.06 | 214.74 | 30.318, 5468.20 | 50.947, 1725.15 | 8.271, 1522.95 | 3.064,  89.16
> ------------------------------------------------------------------------------------------------------------------------
>

Jeff, Jens,
do you think we should try to do more auto-tuning of cfq parameters?
Looking at those numbers for SANs, I think we are being suboptimal in
some cases, e.g. sequential read throughput is lower than random read
throughput.
In those cases, converting all sync queues into sync-noidle (as defined
in the for-2.6.33 branch) should allow better aggregate throughput when
there are multiple sequential readers, as in those tiobench tests.
I also think that the current slice_idle and slice_sync values are good
for devices with an 8ms seek time, but they are too high for non-NCQ
flash devices, where the "seek" penalty is under 1ms, yet we still
prefer idling.
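To make the idea concrete, here is a rough user-space sketch (illustrative only, not actual cfq code; the 1ms threshold, the 7/8 weighting, and all names are made up for the example): keep an exponentially weighted mean of observed seek times, and stop idling between requests once the measured seek penalty drops below the cost of idling.

```c
/* Hedged sketch of seek-time-driven idling, NOT the real cfq logic.
 * The threshold and the 7/8-1/8 weighting are illustrative values. */

#define SEEK_THRESHOLD_US 1000  /* ~1 ms: below this, seeking is cheaper than idling */

struct seek_stats {
	unsigned long mean_seek_us;  /* exponentially weighted mean seek time */
};

/* Fold a new seek-time sample into the running mean (7/8 old, 1/8 new). */
static void seek_sample(struct seek_stats *s, unsigned long sample_us)
{
	s->mean_seek_us = (s->mean_seek_us * 7 + sample_us) / 8;
}

/* Idle between requests only if seeks are still expensive on this device. */
static int should_idle(const struct seek_stats *s)
{
	return s->mean_seek_us >= SEEK_THRESHOLD_US;
}
```

On a rotational disk averaging ~8ms seeks, should_idle() stays true; on a flash device with sub-millisecond "seeks", the mean decays below the threshold and idling is skipped.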
If we agree on this, should the measurement part (I'm thinking of
measuring things like seek time, throughput, etc.) be added to the
common elevator code, or done inside cfq?
If we want to put it in the common code, maybe we can also remove the
duplicated NCQ detection by having the elevator publish the NCQ flag
to the io-schedulers.
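The shape of that could be something like the following (a hypothetical interface for illustration; these struct and function names are invented, not the actual block-layer API): the common elevator code detects NCQ once by watching the in-flight request depth, and each io-scheduler just reads the published flag instead of duplicating the detection.

```c
/* Hedged sketch: hypothetical interface, not the real block layer API. */

struct elevator_queue_info {
	int ncq_capable;   /* published once by common elevator code */
	int max_in_flight; /* deepest hardware queue depth observed */
};

/* Common elevator code: infer NCQ from observed dispatch depth. */
static void elv_update_hw_tag(struct elevator_queue_info *ei, int in_flight)
{
	if (in_flight > ei->max_in_flight)
		ei->max_in_flight = in_flight;
	ei->ncq_capable = (ei->max_in_flight > 1);
}

/* io-scheduler side: consult the published flag, no private detection. */
static int elv_queue_is_ncq(const struct elevator_queue_info *ei)
{
	return ei->ncq_capable;
}
```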

Thanks,
Corrado

>
> Cheers,
> Jeff
> --
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/