Re: [PATCH] sched/fair: Fix the wrong throttled clock time for cfs_rq_clock_task()

From: Xunlei Pang
Date: Wed May 11 2016 - 23:37:03 EST


On 2016/05/11 at 14:49, Peter Zijlstra wrote:
> On Tue, May 10, 2016 at 11:19:44AM -0700, bsegall@xxxxxxxxxx wrote:
>> Xunlei Pang <xlpang@xxxxxxxxxx> writes:
>>
>>> Two minor fixes for cfs_rq_clock_task().
>>> 1) If cfs_rq is currently being throttled, we need to subtract the cfs
>>> throttled clock time.
>>>
>>> 2) Make "throttled_clock_task_time" update SMP unrelated. Now UP cases
>>> need it as well.
>>>
>>> Signed-off-by: Xunlei Pang <xlpang@xxxxxxxxxx>
>>> ---
>>> kernel/sched/fair.c | 4 +---
>>> 1 file changed, 1 insertion(+), 3 deletions(-)
>>>
>>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>>> index 1708729e..fb80a12 100644
>>> --- a/kernel/sched/fair.c
>>> +++ b/kernel/sched/fair.c
>>> @@ -3655,7 +3655,7 @@ static inline struct cfs_bandwidth *tg_cfs_bandwidth(struct task_group *tg)
>>> static inline u64 cfs_rq_clock_task(struct cfs_rq *cfs_rq)
>>> {
>>> if (unlikely(cfs_rq->throttle_count))
>>> - return cfs_rq->throttled_clock_task;
>>> + return cfs_rq->throttled_clock_task - cfs_rq->throttled_clock_task_time;
>>>
>>> return rq_clock_task(rq_of(cfs_rq)) - cfs_rq->throttled_clock_task_time;
>>> }
> The alternative is obviously to do the subtraction in
> tg_throttle_down(), were we set ->throttled_clock_task.

It is possible, but throttled_clock_task is a timestamp, I think doing it here is semantically better.

>
>>> @@ -3793,13 +3793,11 @@ static int tg_unthrottle_up(struct task_group *tg, void *data)
>>> struct cfs_rq *cfs_rq = tg->cfs_rq[cpu_of(rq)];
>>>
>>> cfs_rq->throttle_count--;
>>> -#ifdef CONFIG_SMP
>>> if (!cfs_rq->throttle_count) {
>>> /* adjust cfs_rq_clock_task() */
>>> cfs_rq->throttled_clock_task_time += rq_clock_task(rq) -
>>> cfs_rq->throttled_clock_task;
>>> }
>>> -#endif
>>>
>>> return 0;
>>> }
>> [Cc: pjt@xxxxxxxxxx]
>>
>> This looks reasonable to me (at least the first part; I'm not
>> certain why the CONFIG_SMP ifdef was put in place).
> 64660c864f46 ("sched: Prevent interactions with throttled entities")
>
> Introduced it, because at that time it was about updating shares, which
> is only present on SMP. Then:
>
> f1b17280efbd ("sched: Maintain runnable averages across throttled periods")
>
> Added the clock thing inside it, and:
>
> 82958366cfea ("sched: Replace update_shares weight distribution with per-entity computation")
>
> took out the shares update and left the clock update, resulting in the
> current code.
>
>

Thanks,
Xunlei