Re: [Fwd: [RFC] cfq: adapt slice to number of processes doing I/O(v2.1)]

From: Shan Wei
Date: Thu Sep 24 2009 - 05:21:34 EST


> Subject:
> [RFC] cfq: adapt slice to number of processes doing I/O (v2.1)
>
> When the number of processes performing I/O concurrently increases,
> a fixed time slice per process will cause large latencies.
>
> This (v2.1) patch will scale the time slice assigned to each process,
> according to a target latency (tunable from sysfs, default 300ms).
>
> In order to keep fairness among processes, we adopt two devices, w.r.t. v1:
>
> * The number of active processes is computed using a special form of
> running average that quickly follows sudden increases (to keep latency
> low) and decreases slowly (to preserve fairness in spite of rapid
> decreases of this value).
>
> * The idle time is computed using the remaining slice as a maximum.
>
> To safeguard sequential bandwidth, we impose a minimum time slice
> (computed using 2*cfq_slice_idle as a base, adjusted according to priority
> and async-ness).
>
> Signed-off-by: Corrado Zoccolo <czoccolo@xxxxxxxxx>
>

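To illustrate the "special form of running average" described above in isolation, here is a minimal user-space sketch mirroring cfq_get_interested_queues() from the patch below, with cfq_hist_divisor = 4 (update_avg() and the sample values are mine, for illustration only):

#include <stdio.h>

/* Asymmetric running average as in cfq_get_interested_queues(): with
 * divisor d, the average moves (d-1)/d of the way toward an increase
 * but only 1/d of the way toward a decrease, so it follows load
 * spikes quickly and forgets them slowly. */
static unsigned update_avg(unsigned avg, unsigned cur, unsigned divisor)
{
	unsigned mult = divisor - 1;
	unsigned round = divisor / 2;
	unsigned min_q = cur < avg ? cur : avg;
	unsigned max_q = cur < avg ? avg : cur;

	return (mult * max_q + min_q + round) / divisor;
}

int main(void)
{
	unsigned busy[] = { 1, 8, 8, 1, 1, 1, 1, 1 };	/* illustrative samples */
	unsigned avg = 1;

	for (unsigned i = 0; i < sizeof(busy) / sizeof(busy[0]); i++) {
		avg = update_avg(avg, busy[i], 4);
		printf("busy_queues = %u -> avg = %u\n", busy[i], avg);
	}
	return 0;	/* avg jumps 1 -> 6 -> 8, then decays 6, 5, 4, 3, 3 */
}

With divisor 4 the average moves 3/4 of the way toward an increase but only 1/4 of the way toward a decrease, which is the fast-up/slow-down behavior the description claims.
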
I'm interested in the idea of dynamically tuning the time slice according to the
number of processes doing I/O. I have tested your patch with Jeff's
cfq-regression-tests tool on kernel 2.6.30-rc4.
As the following results show, with your patch applied the fairness is not as
good as on the original kernel, e.g. between I/O priorities 4 and 5 in the
be0-through-7.fio case, and the throughput (total data transferred) becomes lower.
Have you tested buffered writes and multiple threads?

Additionally, I have a question about the minimum time slice; see my comment inline in your patch below.

*Original*(2.6.30-rc4 without patch):
/cfq-regression-tests/2.6.30-rc4-log/be0-through-7.fio
total priority: 880
total data transferred: 535872
class prio ideal xferred %diff
be 0 109610 149748 36
be 1 97431 104436 7
be 2 85252 91124 6
be 3 73073 64244 -13
be 4 60894 59028 -4
be 5 48715 38132 -22
be 6 36536 21492 -42
be 7 24357 7668 -69
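
As an aside on reading these tables: the "ideal" column is consistent with each
queue receiving a share of the total data proportional to a per-priority
weight. A minimal sketch that reproduces the column for the table above (the
weight formula 20 * (9 - prio) is inferred from the numbers, not taken from
Jeff's tool):

#include <stdio.h>

int main(void)
{
	/* be0-through-7 on the unpatched kernel, from the table above. */
	unsigned total_xfer = 535872;
	unsigned weight[8], total_prio = 0;

	for (int p = 0; p < 8; p++) {
		weight[p] = 20 * (9 - p);	/* 180, 160, ..., 40 */
		total_prio += weight[p];	/* sums to 880, as reported */
	}
	for (int p = 0; p < 8; p++)
		printf("be %d ideal = %u\n", p,
		       total_xfer * weight[p] / total_prio);	/* 109610, 97431, ... */
	return 0;
}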

/cfq-regression-tests/2.6.30-rc4-log/be0-vs-be1.fio
total priority: 340
total data transferred: 556008
class prio ideal xferred %diff
be 0 294357 402164 36
be 1 261650 153844 -42

/cfq-regression-tests/2.6.30-rc4-log/be0-vs-be7.fio
total priority: 220
total data transferred: 537064
class prio ideal xferred %diff
be 0 439416 466164 6
be 7 97648 70900 -28

/cfq-regression-tests/2.6.30-rc4-log/be4-x-3.fio
total priority: 300
total data transferred: 532964
class prio ideal xferred %diff
be 4 177654 199260 12
be 4 177654 149748 -16
be 4 177654 183956 3

/cfq-regression-tests/2.6.30-rc4-log/be4-x-8.fio
total priority: 800
total data transferred: 516384
class prio ideal xferred %diff
be 4 64548 78580 21
be 4 64548 76436 18
be 4 64548 75764 17
be 4 64548 70900 9
be 4 64548 42388 -35
be 4 64548 73780 14
be 4 64548 30708 -53
be 4 64548 67828 5


*Applied patch*(2.6.30-rc4 with patch):

/cfq-regression-tests/log-result/be0-through-7.fio
total priority: 880
total data transferred: 493824
class prio ideal xferred %diff
be 0 101009 224852 122
be 1 89786 106996 19
be 2 78562 70388 -11
be 3 67339 38900 -43
be 4 56116 18420 -68
be 5 44893 19700 -57
be 6 33669 9972 -71
be 7 22446 4596 -80

/cfq-regression-tests/log-result/be0-vs-be1.fio
total priority: 340
total data transferred: 537064
class prio ideal xferred %diff
be 0 284328 375540 32
be 1 252736 161524 -37

/cfq-regression-tests/log-result/be0-vs-be7.fio
total priority: 220
total data transferred: 551912
class prio ideal xferred %diff
be 0 451564 499956 10
be 7 100347 51956 -49

/cfq-regression-tests/log-result/be4-x-3.fio
total priority: 300
total data transferred: 509404
class prio ideal xferred %diff
be 4 169801 196596 15
be 4 169801 198388 16
be 4 169801 114420 -33

/cfq-regression-tests/log-result/be4-x-8.fio
total priority: 800
total data transferred: 459072
class prio ideal xferred %diff
be 4 57384 70644 23
be 4 57384 52980 -8
be 4 57384 62356 8
be 4 57384 60660 5
be 4 57384 55028 -5
be 4 57384 69620 21
be 4 57384 51956 -10
be 4 57384 35828 -38


Hardware info:
CPU: GenuineIntel Intel(R) Xeon(TM) CPU 3.00GHz
     (4 logical CPUs with hyper-threading on)
Memory: 2GB
HDD: SCSI

> ---
> diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
> index 0e3814b..ca90d42 100644
> --- a/block/cfq-iosched.c
> +++ b/block/cfq-iosched.c
> @@ -27,6 +27,8 @@ static const int cfq_slice_sync = HZ / 10;
> static int cfq_slice_async = HZ / 25;
> static const int cfq_slice_async_rq = 2;
> static int cfq_slice_idle = HZ / 125;
> +static int cfq_target_latency = HZ * 3/10; /* 300 ms */
> +static int cfq_hist_divisor = 4;
>
> /*
> * offset from end of service tree
> @@ -134,6 +136,9 @@ struct cfq_data {
> struct rb_root prio_trees[CFQ_PRIO_LISTS];
>
> unsigned int busy_queues;
> + unsigned int busy_queues_avg;
> + unsigned int busy_rt_queues;
> + unsigned int busy_rt_queues_avg;
>
> int rq_in_driver[2];
> int sync_flight;
> @@ -173,6 +178,8 @@ struct cfq_data {
> unsigned int cfq_slice[2];
> unsigned int cfq_slice_async_rq;
> unsigned int cfq_slice_idle;
> + unsigned int cfq_target_latency;
> + unsigned int cfq_hist_divisor;
>
> struct list_head cic_list;
>
> @@ -301,10 +308,40 @@ cfq_prio_to_slice(struct cfq_data *cfqd, struct cfq_queue *cfqq)
> return cfq_prio_slice(cfqd, cfq_cfqq_sync(cfqq), cfqq->ioprio);
> }
>
> +static inline unsigned
> +cfq_get_interested_queues(struct cfq_data *cfqd, bool rt) {
> + unsigned min_q, max_q;
> + unsigned mult = cfqd->cfq_hist_divisor - 1;
> + unsigned round = cfqd->cfq_hist_divisor / 2;
> + if (rt) {
> + min_q = min(cfqd->busy_rt_queues_avg, cfqd->busy_rt_queues);
> + max_q = max(cfqd->busy_rt_queues_avg, cfqd->busy_rt_queues);
> + cfqd->busy_rt_queues_avg = (mult * max_q + min_q + round) /
> + cfqd->cfq_hist_divisor;
> + return cfqd->busy_rt_queues_avg;
> + } else {
> + min_q = min(cfqd->busy_queues_avg, cfqd->busy_queues);
> + max_q = max(cfqd->busy_queues_avg, cfqd->busy_queues);
> + cfqd->busy_queues_avg = (mult * max_q + min_q + round) /
> + cfqd->cfq_hist_divisor;
> + return cfqd->busy_queues_avg;
> + }
> +}
> +
> static inline void
> cfq_set_prio_slice(struct cfq_data *cfqd, struct cfq_queue *cfqq)
> {
> - cfqq->slice_end = cfq_prio_to_slice(cfqd, cfqq) + jiffies;
> + unsigned process_thr = cfqd->cfq_target_latency / cfqd->cfq_slice[1];
> + unsigned iq = cfq_get_interested_queues(cfqd, cfq_class_rt(cfqq));
> + unsigned slice = cfq_prio_to_slice(cfqd, cfqq);
> +
> + if (iq > process_thr) {
> + unsigned low_slice = 2 * slice * cfqd->cfq_slice_idle
> + / cfqd->cfq_slice[1];

For a sync queue, the minimum time slice is decided by slice_idle, the base time
slice, and the I/O priority. But for an async queue, why is the minimum time
slice also limited by the base time slice of the sync queue (cfq_slice[1])?
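
To make the question concrete, here is the arithmetic with the default
constants from the top of the patch (a user-space sketch assuming HZ = 1000;
for I/O priority 4, cfq_prio_to_slice() returns the base slice):

#include <stdio.h>

int main(void)
{
	/* Defaults from the top of the patch, assuming HZ = 1000. */
	unsigned slice_idle = 8;	/* cfq_slice_idle = HZ / 125 */
	unsigned slice_sync = 100;	/* cfq_slice[1]   = HZ / 10  */
	unsigned slice_async = 40;	/* cfq_slice[0]   = HZ / 25  */

	/* low_slice = 2 * slice * cfq_slice_idle / cfq_slice[1] */
	printf("sync  prio-4 low_slice = %u\n",
	       2 * slice_sync * slice_idle / slice_sync);	/* 16 = 2 * slice_idle */
	printf("async prio-4 low_slice = %u\n",
	       2 * slice_async * slice_idle / slice_sync);	/* 6, shrunk further */
	return 0;
}

So the sync minimum is exactly 2 * cfq_slice_idle, while the async minimum is
additionally shrunk by the ratio cfq_slice[0] / cfq_slice[1], which is what
prompts the question above.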


Best Regards
-----
Shan Wei

