Re: [PATCH] sched/fair: Fix the wrong throttled clock time for cfs_rq_clock_task()
From: Peter Zijlstra
Date: Wed May 11 2016 - 02:49:41 EST
On Tue, May 10, 2016 at 11:19:44AM -0700, bsegall@xxxxxxxxxx wrote:
> Xunlei Pang <xlpang@xxxxxxxxxx> writes:
>
> > Two minor fixes for cfs_rq_clock_task().
> > 1) If cfs_rq is currently being throttled, we need to subtract the cfs
> > throttled clock time.
> >
> > 2) Make "throttled_clock_task_time" update SMP unrelated. Now UP cases
> > need it as well.
> >
> > Signed-off-by: Xunlei Pang <xlpang@xxxxxxxxxx>
> > ---
> > kernel/sched/fair.c | 4 +---
> > 1 file changed, 1 insertion(+), 3 deletions(-)
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index 1708729e..fb80a12 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -3655,7 +3655,7 @@ static inline struct cfs_bandwidth *tg_cfs_bandwidth(struct task_group *tg)
> > static inline u64 cfs_rq_clock_task(struct cfs_rq *cfs_rq)
> > {
> > if (unlikely(cfs_rq->throttle_count))
> > - return cfs_rq->throttled_clock_task;
> > + return cfs_rq->throttled_clock_task - cfs_rq->throttled_clock_task_time;
> >
> > return rq_clock_task(rq_of(cfs_rq)) - cfs_rq->throttled_clock_task_time;
> > }
The alternative is obviously to do the subtraction in
tg_throttle_down(), were we set ->throttled_clock_task.
> > @@ -3793,13 +3793,11 @@ static int tg_unthrottle_up(struct task_group *tg, void *data)
> > struct cfs_rq *cfs_rq = tg->cfs_rq[cpu_of(rq)];
> >
> > cfs_rq->throttle_count--;
> > -#ifdef CONFIG_SMP
> > if (!cfs_rq->throttle_count) {
> > /* adjust cfs_rq_clock_task() */
> > cfs_rq->throttled_clock_task_time += rq_clock_task(rq) -
> > cfs_rq->throttled_clock_task;
> > }
> > -#endif
> >
> > return 0;
> > }
>
> [Cc: pjt@xxxxxxxxxx]
>
> This looks reasonable to me (at least the first part; I'm not
> certain why the CONFIG_SMP ifdef was put in place).
64660c864f46 ("sched: Prevent interactions with throttled entities")
Introduced it, because at that time it was about updating shares, which
is only present on SMP. Then:
f1b17280efbd ("sched: Maintain runnable averages across throttled periods")
Added the clock thing inside it, and:
82958366cfea ("sched: Replace update_shares weight distribution with per-entity computation")
took out the shares update and left the clock update, resulting in the
current code.