Re: [RFC] cfq: adapt slice to number of processes doing I/O

From: Jeff Moyer
Date: Thu Sep 03 2009 - 09:01:23 EST


Corrado Zoccolo <czoccolo@xxxxxxxxx> writes:

> When the number of processes performing I/O concurrently increases, a
> fixed time slice per process will cause large latencies.
> In the patch, if there are more than 3 processes performing concurrent
> I/O, we scale the time slice down proportionally.
> To safeguard sequential bandwidth, we impose a minimum time slice,
> computed from cfq_slice_idle (the idea is that cfq_slice_idle
> approximates the cost for a seek).
>
> I performed two tests, on a rotational disk:
> * 32 concurrent processes performing random reads
> ** the bandwidth is improved from 466KB/s to 477KB/s
> ** the maximum latency is reduced from 7.667s to 1.728
> * 32 concurrent processes performing sequential reads
> ** the bandwidth is reduced from 28093KB/s to 24393KB/s
> ** the maximum latency is reduced from 3.781s to 1.115s
>
> I expect numbers to be even better on SSDs, where the penalty to
> disrupt sequential read is much less.

Interesting approach. I'm not sure what the benefits will be on SSDs,
as the idling logic is disabled for them (when nonrot is set and they
support ncq). See cfq_arm_slice_timer.

> Signed-off-by: Corrado Zoccolo <czoccolo@gmail-com>
>
> diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
> index fd7080e..cff4ca8 100644
> --- a/block/cfq-iosched.c
> +++ b/block/cfq-iosched.c
> @@ -306,7 +306,15 @@ cfq_prio_to_slice(struct cfq_data *cfqd, struct
> cfq_queue *cfqq)
> static inline void
> cfq_set_prio_slice(struct cfq_data *cfqd, struct cfq_queue *cfqq)
> {
> - cfqq->slice_end = cfq_prio_to_slice(cfqd, cfqq) + jiffies;
> + unsigned low_slice = cfqd->cfq_slice_idle * (1 + cfq_cfqq_sync(cfqq));
> + unsigned interested_queues = cfq_class_rt(cfqq) ?
> cfqd->busy_rt_queues : cfqd->busy_queues;

Either my mailer displayed this wrong, or yours wraps lines.

> + unsigned slice = cfq_prio_to_slice(cfqd, cfqq);
> + if (interested_queues > 3) {
> + slice *= 3;

How did you come to this magic number of 3, both for the number of
competing tasks and the multiplier for the slice time? Did you
experiment with this number at all?

> + slice /= interested_queues;

Of course you realize this could disable the idling logic completely,
right? I'll run this patch through some tests and let you know how it
goes.

Thanks!

-Jeff
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/