Re: [PATCH] sched/fair: fix unthrottle_cfs_rq for leaf_cfs_rq list
From: bsegall
Date: Tue May 12 2020 - 14:59:14 EST
Vincent Guittot <vincent.guittot@xxxxxxxxxx> writes:
> Although not exactly identical, unthrottle_cfs_rq() and enqueue_task_fair()
> are quite close and follow the same sequence for enqueuing an entity in the
> cfs hierarchy. Modify unthrottle_cfs_rq() to use the same pattern as
> enqueue_task_fair(). This fixes a problem already faced with the latter and
> adds an optimization in the last for_each_sched_entity loop.
>
> Fixes: fe61468b2cb (sched/fair: Fix enqueue_task_fair warning)
> Reported-by: Tao Zhou <zohooouoto@xxxxxxxxxxx>
> Signed-off-by: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
> ---
>
> This patch applies on top of 20200507203612.GF19331@xxxxxxxxxxxxxxxxxxxxxxxxx
> and fixes a similar problem for unthrottle_cfs_rq()
>
> kernel/sched/fair.c | 37 ++++++++++++++++++++++++++++---------
> 1 file changed, 28 insertions(+), 9 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index e2450c2e0747..4b73518aa25c 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -4803,26 +4803,44 @@ void unthrottle_cfs_rq(struct cfs_rq *cfs_rq)
>  	idle_task_delta = cfs_rq->idle_h_nr_running;
>  	for_each_sched_entity(se) {
>  		if (se->on_rq)
> -			enqueue = 0;
> +			break;
> +		cfs_rq = cfs_rq_of(se);
> +		enqueue_entity(cfs_rq, se, ENQUEUE_WAKEUP);
>
> +		cfs_rq->h_nr_running += task_delta;
> +		cfs_rq->idle_h_nr_running += idle_task_delta;
> +
> +		/* end evaluation on encountering a throttled cfs_rq */
> +		if (cfs_rq_throttled(cfs_rq))
> +			goto unthrottle_throttle;
> +	}
> +
> +	for_each_sched_entity(se) {
>  		cfs_rq = cfs_rq_of(se);
> -		if (enqueue) {
> -			enqueue_entity(cfs_rq, se, ENQUEUE_WAKEUP);
> -		} else {
> -			update_load_avg(cfs_rq, se, 0);
> -			se_update_runnable(se);
> -		}
> +
> +		update_load_avg(cfs_rq, se, UPDATE_TG);
> +		se_update_runnable(se);
>
>  		cfs_rq->h_nr_running += task_delta;
>  		cfs_rq->idle_h_nr_running += idle_task_delta;
>
> +
> +		/* end evaluation on encountering a throttled cfs_rq */
>  		if (cfs_rq_throttled(cfs_rq))
> -			break;
> +			goto unthrottle_throttle;
> +
> +		/*
> +		 * One parent has been throttled and cfs_rq removed from the
> +		 * list. Add it back to not break the leaf list.
> +		 */
> +		if (throttled_hierarchy(cfs_rq))
> +			list_add_leaf_cfs_rq(cfs_rq);
>  	}
>
>  	if (!se)
The if is no longer necessary here: every early exit from the second loop
now jumps past it, so se is always NULL when the loop falls through. That
differs from enqueue_task_fair(), where the skip goto lands on this if
statement rather than past it (though enqueue could be changed to match).
Also, in general, if we are making these loops absolutely identical, we
should probably pull them out into a common function (ideally including
the goto target and the following loop as well).
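
To illustrate the point about se, here is a toy userspace model, not
kernel code. struct toy_se, toy_walk() and the walk_result values are
made-up names, and the per-cfs_rq state is folded into the entity for
brevity. It only mirrors the control flow of the two loops in the patch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a sched_entity plus the cfs_rq it sits on. */
struct toy_se {
	struct toy_se *parent;	/* NULL at the top of the hierarchy */
	bool on_rq;		/* entity is already enqueued */
	bool rq_throttled;	/* models cfs_rq_throttled(cfs_rq_of(se)) */
	int h_nr_running;	/* models cfs_rq->h_nr_running */
};

enum walk_result { WALK_REACHED_ROOT, WALK_HIT_THROTTLED };

static enum walk_result toy_walk(struct toy_se *se, int task_delta)
{
	/* Phase 1: enqueue entities until one is already on its rq. */
	for (; se; se = se->parent) {
		if (se->on_rq)
			break;
		se->on_rq = true;
		se->h_nr_running += task_delta;
		if (se->rq_throttled)
			return WALK_HIT_THROTTLED;	/* the goto in the patch */
	}

	/* Phase 2: remaining ancestors only need their counts updated. */
	for (; se; se = se->parent) {
		se->h_nr_running += task_delta;
		if (se->rq_throttled)
			return WALK_HIT_THROTTLED;	/* the goto in the patch */
	}

	/* Falling out of phase 2 is only possible once se is NULL. */
	assert(se == NULL);
	return WALK_REACHED_ROOT;
}
```

With a helper shaped like this, a caller would do the add_nr_running()
equivalent only on WALK_REACHED_ROOT, with no separate se check, and
both enqueue and unthrottle could share the same walk.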
>  		add_nr_running(rq, task_delta);
>
> +unthrottle_throttle:
>  	/*
>  	 * The cfs_rq_throttled() breaks in the above iteration can result in
>  	 * incomplete leaf list maintenance, resulting in triggering the
> @@ -4831,7 +4849,8 @@ void unthrottle_cfs_rq(struct cfs_rq *cfs_rq)
>  	for_each_sched_entity(se) {
>  		cfs_rq = cfs_rq_of(se);
>
> -		list_add_leaf_cfs_rq(cfs_rq);
> +		if (list_add_leaf_cfs_rq(cfs_rq))
> +			break;
Do we also need to handle the case of tg_unthrottle_up() followed by an
early exit from unthrottle_cfs_rq()? I do not understand what
list_add_leaf_cfs_rq() is doing well enough to say.
>  	}
>
>  	assert_list_leaf_cfs_rq(rq);