Re: [PATCH 1/2] sched/deadline: add per rq tracking of admitted bandwidth
From: Juri Lelli
Date: Wed Feb 10 2016 - 08:42:00 EST
On 10/02/16 13:48, Luca Abeni wrote:
> Hi,
>
> On Wed, 10 Feb 2016 11:32:58 +0000
> Juri Lelli <juri.lelli@xxxxxxx> wrote:
> [...]
> > From 62f70ca3051672dce209e8355cf5eddc9d825c2a Mon Sep 17 00:00:00 2001
> > From: Juri Lelli <juri.lelli@xxxxxxx>
> > Date: Sat, 6 Feb 2016 12:41:09 +0000
> > Subject: [PATCH 1/2] sched/deadline: add per rq tracking of admitted
> > bandwidth
> >
> > Currently SCHED_DEADLINE scheduling policy tracks bandwidth of tasks
> > that passed admission control at root_domain level only. This creates
> > problems when such data structure(s) are destroyed, when we
> > reconfigure scheduling domains for example.
> >
> > This is part one of two changes required to fix the problem. In this
> > patch we add per-rq tracking of admitted bandwidth. Tasks bring with
> > them their bandwidth contribution when they enter the system and are
> > enqueued for the first time. Contributions are then moved around when
> > migrations happen and removed when tasks die.
>
> I think this patch actually does two different things (addressing two
> separate problems):
> 1) it introduces the tracking of per-rq utilization (used in your next
> patch to address the root domain issues)
> 2) it fixes a bug in the current utilization tracking mechanism.
> Currently, a task doing
> while(1) {
> switch to SCHED_DEADLINE
> switch to SCHED_OTHER
> }
> brings dl_b->total_bw below 0. Thanks to Juri for showing me this
> problem (and how to reproduce it) in a private email.
> This happens because when the task switches back from SCHED_DEADLINE
> to SCHED_OTHER, switched_from_dl() does not clear its deadline
> parameters (they will be cleared by the deadline timer when it
> fires). But dl_overflow() removes its utilization from
> dl_b->total_bw. When the task switches back to SCHED_DEADLINE, the
> if (new_bw == p->dl.dl_bw) check in dl_overflow() prevents
> __dl_add() from being called, and so when the task switches back to
> SCHED_OTHER dl_b->total_bw becomes negative.
>
Yep. That is also the reason why I think, whatever refactoring we are
goind to do, we should keep per-rq and root_domain accounting done
together.
> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > index 9503d59..0ee0ec2 100644
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -2432,7 +2432,7 @@ static int dl_overflow(struct task_struct *p,
> > int policy, u64 new_bw = dl_policy(policy) ? to_ratio(period,
> > runtime) : 0; int cpus, err = -1;
> >
> > - if (new_bw == p->dl.dl_bw)
> > + if (task_has_dl_policy(p) && new_bw == p->dl.dl_bw)
> > return 0;
>
> This hunk actually fixes issue 2) mentioned above, so I think it should
> be committed in a short time (independently from the rest of the
> patch). And maybe is a good candidate for backporting to stable kernels?
>
Yes, this is a sensible fix per se. I can split it and send it
separately.
Best,
- Juri