Re: [RFC][PATCH 12/13] sched/deadline: Introduce deadline servers
From: Juri Lelli
Date: Thu Aug 08 2019 - 05:27:51 EST
On 08/08/19 10:57, Dietmar Eggemann wrote:
> On 8/8/19 10:46 AM, Juri Lelli wrote:
> > On 08/08/19 10:11, Dietmar Eggemann wrote:
> >> On 8/8/19 9:56 AM, Peter Zijlstra wrote:
> >>> On Wed, Aug 07, 2019 at 06:31:59PM +0200, Dietmar Eggemann wrote:
> >>>> On 7/26/19 4:54 PM, Peter Zijlstra wrote:
> >>>>>
> >>>>>
> >>>>> Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
> >>>>
> >>>> [...]
> >>>>
> >>>>> @@ -889,6 +891,8 @@ static void update_curr(struct cfs_rq *c
> >>>>>  		trace_sched_stat_runtime(curtask, delta_exec, curr->vruntime);
> >>>>>  		cgroup_account_cputime(curtask, delta_exec);
> >>>>>  		account_group_exec_runtime(curtask, delta_exec);
> >>>>> +		if (curtask->server)
> >>>>> +			dl_server_update(curtask->server, delta_exec);
> >>>>>  	}
> >>>>
> >>>> I get a lockdep_assert_held(&rq->lock) related warning in start_dl_timer()
> >>>> when running the full stack.
> >>>
> >>> That would seem to imply a stale curtask->server value; the hunk below:
> >>>
> >>> --- a/kernel/sched/core.c
> >>> +++ b/kernel/sched/core.c
> >>> @@ -3756,8 +3756,11 @@ pick_next_task(struct rq *rq, struct tas
> >>>
> >>>  	for_each_class(class) {
> >>>  		p = class->pick_next_task(rq, NULL, NULL);
> >>> -		if (p)
> >>> +		if (p) {
> >>> +			if (p->sched_class == class && p->server)
> >>> +				p->server = NULL;
> >>>  			return p;
> >>> +		}
> >>>  	}
> >>>
> >>>
> >>> Was supposed to clear p->server, but clearly something is going 'funny'.
> >>
> >> What about the fast path in pick_next_task()?
> >>
> >> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> >> index bffe849b5a42..f1ea6ae16052 100644
> >> --- a/kernel/sched/core.c
> >> +++ b/kernel/sched/core.c
> >> @@ -3742,6 +3742,9 @@ pick_next_task(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
> >>  		if (unlikely(!p))
> >>  			p = idle_sched_class.pick_next_task(rq, prev, rf);
> >>
> >> +		if (p->sched_class == &fair_sched_class && p->server)
> >> +			p->server = NULL;
> >> +
> >> +
> >
> > Hummm, but then who sets it back to the correct server. AFAIU
> > update_curr() needs a ->server to do the correct DL accounting?
>
> Ah, OK, this would kill the whole functionality ;-)
>
I'm thinking we could use &rq->fair_server instead of NULL. That seems to
get past the point we are discussing, but then the virt box becomes
unresponsive (busy loops).
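
To make the idea concrete, here is a toy userspace sketch, not kernel
code. Every toy_* name is made up for illustration, and it assumes the
stale ->server points at some other runqueue's server, which nothing in
this thread confirms. It only shows the difference between clearing the
pointer (no accounting at all) and re-pointing it at the picking rq's
fair_server:

```c
#include <assert.h>
#include <stddef.h>

/* Toy userspace stand-ins for illustration; NOT the kernel's structures. */
struct toy_dl_server { long long runtime_ns; };	/* remaining budget */
struct toy_rq        { struct toy_dl_server fair_server; };
struct toy_task      { int is_fair; struct toy_dl_server *server; };

/* Mirrors the update_curr() hunk: charge a server only if one is attached. */
static void toy_update_curr(struct toy_task *curr, long long delta_exec)
{
	if (curr->server)
		curr->server->runtime_ns -= delta_exec;
}

/*
 * Fast-path pick: rather than clearing ->server (which would stop all
 * accounting), point a fair task at the picking rq's own fair server.
 */
static struct toy_task *toy_pick_fast_path(struct toy_rq *rq,
					   struct toy_task *p)
{
	if (p->is_fair)
		p->server = &rq->fair_server;
	return p;
}

/*
 * Scenario: the task still carries a pointer to rq1's server (assumed
 * "stale" case), then gets picked on rq0 and runs for 300ns.
 * Returns 1 if only rq0's server was charged.
 */
static int toy_demo(void)
{
	struct toy_rq rq0 = { { 1000 } }, rq1 = { { 1000 } };
	struct toy_task t = { 1, &rq1.fair_server };

	toy_pick_fast_path(&rq0, &t);
	assert(t.server == &rq0.fair_server);
	toy_update_curr(&t, 300);

	return rq0.fair_server.runtime_ns == 700 &&
	       rq1.fair_server.runtime_ns == 1000;
}
```

This says nothing about why the box then busy loops; it only models the
pointer handling on the fast path.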