Re: [PATCH] sched/fair: cfs quota cause large schedule latency

From: Peter Zijlstra
Date: Fri Jul 20 2018 - 09:16:06 EST


On Mon, Jul 16, 2018 at 07:08:41AM +0000, Xiexiangyou wrote:
> The virtual machine has the following cgroup hierarchy:
>
>            root
>             |
>           vm_tg
>          (cfs_rq)
>           /    \
>        (se)    (se)
>        tg_A    tg_B
>      (cfs_rq) (cfs_rq)
>         /        \
>       (se)       (se)
>        a          b
>
> a and b are the two vCPUs of the VM.
>
> We set a CFS quota on vm_tg, and the scheduling latency of the vCPUs (a/b) can then become very large, more than 2 s.
>
> The perf sched test result shows:
>
>  Task            | Runtime ms | Switches | Average delay ms | Maximum delay ms | Maximum delay at       |
> ---------------------------------------------------------------------------------------------------------
>  CPU 0/KVM:49609 | 260.261 ms |       50 | avg:   82.017 ms | max: 2510.990 ms | max at: 43335.555886 s
>  .....
>
> We added some tracepoints and found that the following sequence leads to the issue:
>
> - 'a' is the only task in tg_A. When 'a' goes to sleep, tg_A is dequeued and tg_A->se->load.weight = MIN_SHARES.
> - 'b' continues running and then triggers throttling, so tg_A->cfs_rq->throttle_count = 1.
> - Some task wakes up 'a'. When tg_A is enqueued, tg_A->se->load.weight cannot be updated because tg_A->cfs_rq->throttle_count = 1.
> - After one CFS quota period, vm_tg is unthrottled.
> - 'a' starts running.
> - After one tick, when tg_A->se's vruntime is updated, tg_A->se->load.weight is still MIN_SHARES, so tg_A->se's vruntime grows by a large value.
> - That causes 'a' to have a large scheduling latency.
>
> The fix is as follows:
>
> Signed-off-by: Xiangyou Xie <xiexiangyou@xxxxxxxxxx<mailto:xiexiangyou@xxxxxxxxxx>>

The above Changelog violates just about every formatting rule ever
invented. Also you got your email format wrong.

The patch might be OK, but at this point I really can't do anything with
it anyway.

> ---
> kernel/sched/fair.c | 3 ---
> 1 file changed, 3 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 2f0a0be..348ccd6 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -3016,9 +3016,6 @@ static void update_cfs_group(struct sched_entity *se)
>  	if (!gcfs_rq)
>  		return;
>
> -	if (throttled_hierarchy(gcfs_rq))
> -		return;
> -
>  #ifndef CONFIG_SMP
>  	runnable = shares = READ_ONCE(gcfs_rq->tg->shares);
>
> --
> 1.8.3.1
>
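
For readers following the quoted changelog, the sequence it describes can be replayed in a rough user-space sketch. This is a toy model, not kernel code: the toy_* names are hypothetical, the constants are the unscaled values (MIN_SHARES = 2, NICE_0_LOAD = 1024) and ignore the kernel's fixed-point load scaling, and the 4 ms tick and group shares of 1024 are assumptions chosen only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Unscaled illustrative values; the kernel's load scaling is ignored. */
#define TOY_NICE_0_LOAD 1024ULL
#define TOY_MIN_SHARES  2ULL

/* Hypothetical stand-in for a group sched_entity; not the kernel struct. */
struct toy_group_se {
	uint64_t weight;	/* models tg_A->se->load.weight */
	int throttle_count;	/* models tg_A->cfs_rq->throttle_count */
	uint64_t vruntime_ns;	/* models tg_A->se's vruntime */
};

/* Models the early return that the quoted patch removes. */
static void toy_update_weight(struct toy_group_se *se, uint64_t shares,
			      int skip_when_throttled)
{
	if (skip_when_throttled && se->throttle_count > 0)
		return;		/* weight keeps its stale value */
	se->weight = shares < TOY_MIN_SHARES ? TOY_MIN_SHARES : shares;
}

/* Simplified rule: vruntime advances by delta_exec * NICE_0_LOAD / weight. */
static uint64_t toy_vruntime_delta_ns(uint64_t delta_exec_ns, uint64_t weight)
{
	assert(weight > 0);	/* guard the division */
	return delta_exec_ns * TOY_NICE_0_LOAD / weight;
}

/* Replays the changelog's sequence and returns the resulting vruntime. */
static uint64_t toy_replay(int skip_when_throttled)
{
	/* 'a' went to sleep: the group entity sits at MIN_SHARES. */
	struct toy_group_se se = { .weight = TOY_MIN_SHARES };

	se.throttle_count = 1;		/* 'b' runs on and triggers throttling */
	toy_update_weight(&se, 1024, skip_when_throttled);	/* 'a' wakes up */
	se.throttle_count = 0;		/* vm_tg is unthrottled */

	/* 'a' runs for one assumed 4 ms tick. */
	se.vruntime_ns += toy_vruntime_delta_ns(4000000ULL, se.weight);
	return se.vruntime_ns;
}
```

Under these assumptions, toy_replay(1) (check kept) advances vruntime by 2048000000 ns, roughly 2 s for a single 4 ms tick, which is the same order of magnitude as the reported maximum delay. toy_replay(0) (check removed) advances it by only 4000000 ns.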