Re: [RFC V2 PATCH 1/5] cfq-iosched: adapt slice to number of processes doing I/O

From: Corrado Zoccolo
Date: Wed Oct 21 2009 - 12:32:42 EST


Hi Jeff,
On Wed, Oct 21, 2009 at 5:57 PM, Jeff Moyer <jmoyer@xxxxxxxxxx> wrote:
> Hi, Corrado!
>
> Sorry if folks receive this twice, but my mailer and I had an argument
> about this message. ;-)
>
> Corrado Zoccolo <czoccolo@xxxxxxxxx> writes:
>
>> When the number of processes performing I/O concurrently increases,
>> a fixed time slice per process will cause large latencies.
>>
>> This patch, if low_latency mode is enabled, will scale the time slice
>> assigned to each process according to a 300ms target latency.
>>
>> In order to keep fairness among processes:
>> * The number of active processes is computed using a special form of
>> running average, that quickly follows sudden increases (to keep latency low),
>> and decrease slowly (to have fairness in spite of rapid decreases of this
>> value).
>>
>> To safeguard sequential bandwidth, we impose a minimum time slice
>> (computed using 2*cfq_slice_idle as base, adjusted according to priority
>> and async-ness).
>
> I like the idea as well, but I have a question and some nits to pick.
>
>>  static inline void
>>  cfq_set_prio_slice(struct cfq_data *cfqd, struct cfq_queue *cfqq)
>>  {
>> -       cfqq->slice_end = cfq_prio_to_slice(cfqd, cfqq) + jiffies;
>> +       unsigned slice = cfq_prio_to_slice(cfqd, cfqq);
>> +       if (cfqd->cfq_latency) {
>> +               unsigned iq = cfq_get_avg_queues(cfqd, cfq_class_rt(cfqq));
>> +               unsigned process_thr = cfq_target_latency / cfqd->cfq_slice[1];
>> +               if (iq > process_thr) {
>> +                       unsigned low_slice = 2 * slice * cfqd->cfq_slice_idle
>> +                               / cfqd->cfq_slice[1];
>> +                       slice = max(slice * cfq_target_latency /
>> +                               (cfqd->cfq_slice[1] * iq),
>
> Couldn't you have just divided the slice by iq? And why iq? Why not
> nr_qs or avg_qlen or something? It's a minor nit; I can live with it.

iq stands for interested queues, because we are restricting the count
just to the same priority class, not all queues in the system.
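The patch description quoted above says this count is a running average that follows increases quickly and decreases slowly. A minimal sketch of one way to get that behaviour follows; it is an illustration, not the code from the patch. The name avg_queues_update and the HIST_DIVISOR value are assumptions made for this example.

```c
/*
 * Hypothetical sketch, not the patch's implementation.
 * The larger of (previous average, current busy count) gets most of the
 * weight, so the result jumps up quickly and decays slowly.
 * HIST_DIVISOR is an assumed tuning constant.
 */
#define HIST_DIVISOR 4u

static unsigned avg_queues_update(unsigned prev_avg, unsigned busy)
{
    unsigned min_q = prev_avg < busy ? prev_avg : busy;
    unsigned max_q = prev_avg > busy ? prev_avg : busy;
    unsigned mult  = HIST_DIVISOR - 1;   /* weight given to the larger value */
    unsigned round = HIST_DIVISOR / 2;   /* round to nearest instead of truncating */

    return (mult * max_q + min_q + round) / HIST_DIVISOR;
}
```

With these assumed weights, going from an average of 1 to 9 busy queues gives 7 in one step, while dropping from an average of 9 to 1 busy queue also gives 7, so the estimate stays high for a while after a sudden decrease.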

>
>> +                               min(slice, low_slice));
>> +               }
>> +       }
>> +       cfqq->slice_end = jiffies + slice;
>>         cfq_log_cfqq(cfqd, cfqq, "set_slice=%lu", cfqq->slice_end - jiffies);
>
> Wow. That function is *dense*. I tried to write it in a more
> readable fashion, but please chime in if I misinterpreted anything.
>
> static inline void
> cfq_set_prio_slice(struct cfq_data *cfqd, struct cfq_queue *cfqq)
> {
>         unsigned slice = cfq_prio_to_slice(cfqd, cfqq);
>
>         if (cfqd->cfq_latency) {
>                 unsigned iq = cfq_get_avg_queues(cfqd, cfq_class_rt(cfqq));
>                 unsigned slice_sync = cfqd->cfq_slice[1];
>                 unsigned process_thr = cfq_target_latency / slice_sync;
>
>                 if (iq > process_thr) {
>                         /*
>                          * Minimum slice is computed using 2*slice_idle as
>                          * a base, and then scaling it by priority and
>                          * async-ness.
>                          */
>                         unsigned total_sync = slice_sync * iq;
>                         unsigned slice_fraction = cfq_target_latency / total_sync;
>                         unsigned min_slice = (2 * cfqd->cfq_slice_idle) *
>                                 (slice / slice_sync);
>                         min_slice = min(slice, min_slice);
>                         slice *= slice_fraction;
>                         slice = max(slice, min_slice);
>                 }
>         }
>         cfqq->slice_end = jiffies + slice;
>         cfq_log_cfqq(cfqd, cfqq, "set_slice=%lu", cfqq->slice_end - jiffies);
> }
>
I don't think this is equivalent. You perform some divisions too
early, which loses precision in integer arithmetic.
  slice * cfq_target_latency / (cfqd->cfq_slice[1] * iq)
is not generally equivalent to:
  slice * (cfq_target_latency / (cfqd->cfq_slice[1] * iq))
which is what you are computing.
There is another such case in your simplification.
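
To make the difference concrete, here is a small sketch comparing the two orders of evaluation with unsigned integer arithmetic. The function names and the sample values are made up for illustration. The second pair of functions shows what is presumably the other case meant above (slice / slice_sync in the min_slice computation); that reading is an assumption.

```c
/* Illustrative helpers only; names and values are not from the patch. */

/* Multiply first, divide once: truncation happens only at the end. */
static unsigned scale_late_division(unsigned slice, unsigned target,
                                    unsigned slice_sync, unsigned iq)
{
    return slice * target / (slice_sync * iq);
}

/* Divide first: target / (slice_sync * iq) is truncated before scaling. */
static unsigned scale_early_division(unsigned slice, unsigned target,
                                     unsigned slice_sync, unsigned iq)
{
    return slice * (target / (slice_sync * iq));
}

/* Minimum slice as in the original patch: 2 * slice * idle / slice_sync. */
static unsigned min_slice_late(unsigned slice, unsigned idle,
                               unsigned slice_sync)
{
    return 2 * slice * idle / slice_sync;
}

/* Minimum slice as in the rewrite: (2 * idle) * (slice / slice_sync). */
static unsigned min_slice_early(unsigned slice, unsigned idle,
                                unsigned slice_sync)
{
    return (2 * idle) * (slice / slice_sync);
}
```

For example, with slice = 100, target = 300, slice_sync = 100 and iq = 4, the late division gives 75 while the early division gives 0. In fact, whenever iq > target / slice_sync, the product slice_sync * iq exceeds target, so the early quotient always truncates to 0 inside that branch. Likewise, with slice = 60, idle = 8 and slice_sync = 100, the late form gives 9 and the early form gives 0, because slice / slice_sync truncates to 0 for any slice shorter than the sync slice.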

Corrado

>
> Cheers,
> Jeff
>