Re: [PATCH] cfq: Take whether cfq group is changed into account when choosing service tree

From: Corrado Zoccolo
Date: Fri Dec 11 2009 - 13:01:23 EST


Hi guys,
On Fri, Dec 11, 2009 at 4:07 PM, Vivek Goyal <vgoyal@xxxxxxxxxx> wrote:
> On Fri, Dec 11, 2009 at 01:02:10PM +0800, Gui Jianfeng wrote:
>> Currently, with IO Controller introduced, CFQ chooses cfq group
>> at the top, and then choose service tree. So we need to take
>> whether cfq group is changed into account to decide whether we
>> should choose service tree start from scratch.
>>
>
> I am not able to understand the need/purpose of this patch. Once we
> switched the group during scheduling, why should we reset the order
> of workload with-in group.

I understand it, and in fact I was thinking about this.
The idea is the same as with priorities. If we have not serviced a group
for a while, we want to start with the no-idle workload to reduce its latency.

Unfortunately, a group may have a too small share, that could cause some
workloads to be starved, as Vivek correctly points out, and we should
avoid this.
It should be easily reproducible testing a "many groups with mixed workloads"
scenario with group_isolation=1.

Moreover, even if the approach is groups on top, when group isolation
is not set (i.e. the default), in the non-root groups you will only
have the sync-idle
queues, so it is much more similar (logically) to a workload on top
than it appears
from the code.

I think the net result of this patch is, when group isolation is not
set, to select no-idle
workload first only when entering the root group, thus a slight
penalization of the
async workload.
Gui, if you observed improvements with this patch, probably you can obtain them
without the starvation drawback by making it conditional to !group_isolation.

BTW, since you always use cfqg_changed and prio_changed OR-ed together, you
can have just one argument, called e.g. boost_no_idle, and you pass
(!group_isolation && cfqg_changed) || prio_changed.

> In fact we don't want to do that and we
> want to start from where we left so that no workload with-in group is
> starved.

Agreed.

Thanks,
Corrado

>
> When a group is resumed, choose_service_tree() always checks if any RT
> queue got backlogged or not. If yes, it does choose that first.
>
> Can you give more details why do you want to do this change?
>
> Thanks
> Vivek
>
>> Signed-off-by: Gui Jianfeng <guijianfeng@xxxxxxxxxxxxxx>
>> ---
>> Âblock/cfq-iosched.c | Â 27 ++++++++++++++++++---------
>> Â1 files changed, 18 insertions(+), 9 deletions(-)
>>
>> diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
>> index f3f6239..16084ca 100644
>> --- a/block/cfq-iosched.c
>> +++ b/block/cfq-iosched.c
>> @@ -1964,7 +1964,7 @@ static void cfq_setup_merge(struct cfq_queue *cfqq, struct cfq_queue *new_cfqq)
>>
>> Âstatic enum wl_type_t cfq_choose_wl(struct cfq_data *cfqd,
>> Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â struct cfq_group *cfqg, enum wl_prio_t prio,
>> - Â Â Â Â Â Â Â Â Â Â Â Â Â Â bool prio_changed)
>> + Â Â Â Â Â Â Â Â Â Â Â Â Â Â bool cfqg_changed, bool prio_changed)
>> Â{
>> Â Â Â struct cfq_queue *queue;
>> Â Â Â int i;
>> @@ -1972,7 +1972,7 @@ static enum wl_type_t cfq_choose_wl(struct cfq_data *cfqd,
>> Â Â Â unsigned long lowest_key = 0;
>> Â Â Â enum wl_type_t cur_best = SYNC_NOIDLE_WORKLOAD;
>>
>> - Â Â if (prio_changed) {
>> + Â Â if (cfqg_changed || prio_changed) {
>> Â Â Â Â Â Â Â /*
>> Â Â Â Â Â Â Â Â* When priorities switched, we prefer starting
>> Â Â Â Â Â Â Â Â* from SYNC_NOIDLE (first choice), or just SYNC
>> @@ -2001,7 +2001,8 @@ static enum wl_type_t cfq_choose_wl(struct cfq_data *cfqd,
>> Â Â Â return cur_best;
>> Â}
>>
>> -static void choose_service_tree(struct cfq_data *cfqd, struct cfq_group *cfqg)
>> +static void choose_service_tree(struct cfq_data *cfqd,
>> + Â Â Â Â Â Â Â Â Â Â Â Â Â Â struct cfq_group *cfqg, bool cfqg_changed)
>> Â{
>> Â Â Â enum wl_prio_t previous_prio = cfqd->serving_prio;
>> Â Â Â bool prio_changed;
>> @@ -2033,21 +2034,22 @@ static void choose_service_tree(struct cfq_data *cfqd, struct cfq_group *cfqg)
>> Â Â Â Â* expiration time
>> Â Â Â Â*/
>> Â Â Â prio_changed = (cfqd->serving_prio != previous_prio);
>> - Â Â st = service_tree_for(cfqg, cfqd->serving_prio, cfqd->serving_type,
>> - Â Â Â Â Â Â Â Â Â Â Â Â Â Â cfqd);
>> + Â Â st = service_tree_for(cfqg, cfqd->serving_prio, cfqd->serving_type,
>> + Â Â Â Â Â Â Â Â Â Â Â Â Â cfqd);
>> Â Â Â count = st->count;
>>
>> Â Â Â /*
>> Â Â Â Â* If priority didn't change, check workload expiration,
>> Â Â Â Â* and that we still have other queues ready
>> Â Â Â Â*/
>> - Â Â if (!prio_changed && count &&
>> + Â Â if (!cfqg_changed && !prio_changed && count &&
>> Â Â Â Â Â !time_after(jiffies, cfqd->workload_expires))
>> Â Â Â Â Â Â Â return;
>>
>> Â Â Â /* otherwise select new workload type */
>> Â Â Â cfqd->serving_type =
>> - Â Â Â Â Â Â cfq_choose_wl(cfqd, cfqg, cfqd->serving_prio, prio_changed);
>> + Â Â Â Â Â Â cfq_choose_wl(cfqd, cfqg, cfqd->serving_prio,
>> + Â Â Â Â Â Â Â Â Â Â Â Â Â cfqg_changed, prio_changed);
>> Â Â Â st = service_tree_for(cfqg, cfqd->serving_prio, cfqd->serving_type,
>> Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â cfqd);
>> Â Â Â count = st->count;
>> @@ -2104,9 +2106,16 @@ static struct cfq_group *cfq_get_next_cfqg(struct cfq_data *cfqd)
>>
>> Âstatic void cfq_choose_cfqg(struct cfq_data *cfqd)
>> Â{
>> + Â Â bool cfqg_changed = false;
>> +
>> + Â Â struct cfq_group *orig_cfqg = cfqd->serving_group;
>> +
>> Â Â Â struct cfq_group *cfqg = cfq_get_next_cfqg(cfqd);
>>
>> - Â Â cfqd->serving_group = cfqg;
>> + Â Â if (orig_cfqg != cfqg) {
>> + Â Â Â Â Â Â cfqg_changed = 1;
>> + Â Â Â Â Â Â cfqd->serving_group = cfqg;
>> + Â Â }
>>
>> Â Â Â /* Restore the workload type data */
>> Â Â Â if (cfqg->saved_workload_slice) {
>> @@ -2114,7 +2123,7 @@ static void cfq_choose_cfqg(struct cfq_data *cfqd)
>> Â Â Â Â Â Â Â cfqd->serving_type = cfqg->saved_workload;
>> Â Â Â Â Â Â Â cfqd->serving_prio = cfqg->saved_serving_prio;
>> Â Â Â }
>> - Â Â choose_service_tree(cfqd, cfqg);
>> + Â Â choose_service_tree(cfqd, cfqg, cfqg_changed);
>> Â}
>>
>> Â/*
>> --
>> 1.5.4.rc3
>>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at Âhttp://vger.kernel.org/majordomo-info.html
> Please read the FAQ at Âhttp://www.tux.org/lkml/
>



--
__________________________________________________________________________

dott. Corrado Zoccolo mailto:czoccolo@xxxxxxxxx
PhD - Department of Computer Science - University of Pisa, Italy
--------------------------------------------------------------------------
The self-confidence of a warrior is not the self-confidence of the average
man. The average man seeks certainty in the eyes of the onlooker and calls
that self-confidence. The warrior seeks impeccability in his own eyes and
calls that humbleness.
Tales of Power - C. Castaneda
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/