Re: [RFC][PATCH 12/13] sched/deadline: Introduce deadline servers

From: Dietmar Eggemann
Date: Thu Aug 08 2019 - 04:57:18 EST


On 8/8/19 10:46 AM, Juri Lelli wrote:
> On 08/08/19 10:11, Dietmar Eggemann wrote:
>> On 8/8/19 9:56 AM, Peter Zijlstra wrote:
>>> On Wed, Aug 07, 2019 at 06:31:59PM +0200, Dietmar Eggemann wrote:
>>>> On 7/26/19 4:54 PM, Peter Zijlstra wrote:
>>>>>
>>>>>
>>>>> Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
>>>>
>>>> [...]
>>>>
>>>>> @@ -889,6 +891,8 @@ static void update_curr(struct cfs_rq *c
>>>>>  		trace_sched_stat_runtime(curtask, delta_exec, curr->vruntime);
>>>>>  		cgroup_account_cputime(curtask, delta_exec);
>>>>>  		account_group_exec_runtime(curtask, delta_exec);
>>>>> +		if (curtask->server)
>>>>> +			dl_server_update(curtask->server, delta_exec);
>>>>>  	}
>>>>
>>>> I get a lockdep_assert_held(&rq->lock) related warning in start_dl_timer()
>>>> when running the full stack.
>>>
>>> That would seem to imply a stale curtask->server value; the hunk below:
>>>
>>> --- a/kernel/sched/core.c
>>> +++ b/kernel/sched/core.c
>>> @@ -3756,8 +3756,11 @@ pick_next_task(struct rq *rq, struct tas
>>>
>>>  	for_each_class(class) {
>>>  		p = class->pick_next_task(rq, NULL, NULL);
>>> -		if (p)
>>> +		if (p) {
>>> +			if (p->sched_class == class && p->server)
>>> +				p->server = NULL;
>>>  			return p;
>>> +		}
>>>  	}
>>>
>>>
>>> Was supposed to clear p->server, but clearly something is going 'funny'.
>>
>> What about the fast path in pick_next_task()?
>>
>> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>> index bffe849b5a42..f1ea6ae16052 100644
>> --- a/kernel/sched/core.c
>> +++ b/kernel/sched/core.c
>> @@ -3742,6 +3742,9 @@ pick_next_task(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
>>  		if (unlikely(!p))
>>  			p = idle_sched_class.pick_next_task(rq, prev, rf);
>>
>> +		if (p->sched_class == &fair_sched_class && p->server)
>> +			p->server = NULL;
>> +
>> +
>
> Hummm, but then who sets it back to the correct server? AFAIU
> update_curr() needs a ->server to do the correct DL accounting?

Ah, OK, this would kill the whole functionality ;-)
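
To spell out why: below is a minimal userspace sketch of the interaction,
not kernel code. The struct layouts and the helpers update_curr_model(),
fast_path_pick_model() and run_once() are hypothetical stand-ins I made up
for illustration; only the two if-statements mirror the hunks quoted above.
It assumes the server's budget is a plain counter that update_curr()
decrements via the task's ->server pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for the kernel structures. */
struct dl_server {
	long long runtime;		/* remaining budget, in ns */
};

struct task {
	struct dl_server *server;	/* set when picked through a server */
};

/* Models the update_curr() hunk: charge the server only if one is attached. */
void update_curr_model(struct task *curtask, long long delta_exec)
{
	if (curtask->server)
		curtask->server->runtime -= delta_exec;
}

/* Models the proposed fast-path hunk: drop the server unconditionally. */
void fast_path_pick_model(struct task *p)
{
	if (p->server)
		p->server = NULL;
}

/*
 * Run a task that was picked through a server for 'delta' ns and return
 * the budget the server has left afterwards.
 */
long long run_once(int clear_in_fast_path, long long budget, long long delta)
{
	struct dl_server srv = { .runtime = budget };
	struct task p = { .server = &srv };

	if (clear_in_fast_path)
		fast_path_pick_model(&p);

	update_curr_model(&p, delta);
	return srv.runtime;
}
```

With the pointer left alone, run_once(0, 1000, 300) leaves a budget of 700.
With the fast-path clear, run_once(1, 1000, 300) leaves it at 1000, i.e. the
server is never charged for the time its task ran, which is the accounting
Juri is asking about.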