Re: [PATCH 3/2] sched/deadline: Use deadline instead of period when calculating overflow

From: Luca Abeni
Date: Wed Feb 15 2017 - 08:13:13 EST


On Wed, 15 Feb 2017 12:59:25 +0000
Juri Lelli <juri.lelli@xxxxxxx> wrote:

> On 15/02/17 13:31, Luca Abeni wrote:
> > Hi Juri,
> >
> > On Wed, 15 Feb 2017 10:29:19 +0000
> > Juri Lelli <juri.lelli@xxxxxxx> wrote:
> > [...]
> > > > Ok, thanks; I think I can now see why this can result in a task
> > > > consuming more than the reserved utilisation. I still need some
> > > > time to convince myself that "runtime / (deadline - t) >
> > > > dl_runtime / dl_deadline" is the correct check to use (in this
> > > > case, shouldn't we also change the admission test to use
> > > > densities instead of utilisations?)
> > >
> > > Right, this is what I was wondering as well, as dl_overflow()
> > > currently looks at the period. And I also have some recollection
> > > of this discussion happening already in the past, unfortunately
> > > it was not on the list.
> > >
> > > That discussion started with the following patch
> > [...]
> > > that we then decided not to propose since (note that these are
> > > just my memories of the discussion, so everything is up for
> > > further discussion, also in light of the problem highlighted by
> > > Daniel)
> > >
> > > - SCHED_DEADLINE, as the documentation says, does AC using
> > > utilization
> > > - it is however true that a sufficient (but not necessary) test
> > > on UP for D_i != P_i cases is the one of my patch above
> > > - we have agreed in the past that the kernel should only check
> > > that we don't cause "overload" in the system (which is still the
> > > case if we consider utilizations), not "hard schedulability"
> > I remember a similar discussion; I think the decision about what to
> > do depends on what the requirements are: hard deadline guarantees
> > (but in this case global EDF is just a bad choice) or tardiness /
> > no-overload guarantees?
> >
> > My understanding was that the kernel guarantees that deadline tasks
> > will not starve non-deadline tasks, and that there is an upper bound
> > for the tardiness experienced by deadline tasks. If this
> > understanding is correct, then the current admission test is ok.
> > But if I misunderstood the purpose of the kernel admission test,
> > then maybe your patch is ok.
> >
> > Then, it is important to keep the admission test consistent with the
> > checks performed in dl_entity_overflow() (but whatever we decide to
> > do, dl_entity_overflow() should be fixed).
> >
>
> I'm sorry, but I'm a bit lost. :(
>
> Why do you say 'whatever we decide to do'?
>
> In my understanding:
>
> - if we decide AC shouldn't change (as we care about not-starving
> others and having bounded tardiness), then I'd say
> dl_entity_overflow shouldn't change either, since it's using
> dl_runtime/dl_period as 'static bandwidth' (as AC does)

Yes, but it is comparing dl_runtime/dl_period with
runtime/(deadline-t), mixing different things. I still need to think
more about this, but I think it should either compare
runtime/(deadline-t) with dl_runtime/dl_deadline or
runtime/(end_of_reservation_period-t) with dl_runtime/dl_period...
Otherwise we risk having issues like the ones shown by Daniel and Steven.
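
To make the three comparisons concrete, here is a minimal user-space
sketch (not the kernel code). The struct and function names are made up
for illustration, the divisions are cross-multiplied to stay in integer
arithmetic, 64-bit multiplication overflow is ignored, and
"end_of_reservation_period" is assumed to be
deadline + (dl_period - dl_deadline):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the static reservation parameters. */
struct dl_params {
	uint64_t dl_runtime;	/* reserved runtime per period */
	uint64_t dl_deadline;	/* relative deadline, <= dl_period */
	uint64_t dl_period;	/* reservation period */
};

/* Hypothetical model of the current scheduling state. */
struct dl_state {
	uint64_t runtime;	/* remaining runtime */
	uint64_t deadline;	/* absolute scheduling deadline */
};

/* runtime / (deadline - t) > dl_runtime / dl_period (the "mixed" check) */
static bool overflow_mixed(const struct dl_params *p,
			   const struct dl_state *s, uint64_t t)
{
	assert(s->deadline > t);
	return s->runtime * p->dl_period > (s->deadline - t) * p->dl_runtime;
}

/* runtime / (deadline - t) > dl_runtime / dl_deadline (density) */
static bool overflow_density(const struct dl_params *p,
			     const struct dl_state *s, uint64_t t)
{
	assert(s->deadline > t);
	return s->runtime * p->dl_deadline > (s->deadline - t) * p->dl_runtime;
}

/* runtime / (period_end - t) > dl_runtime / dl_period (utilisation) */
static bool overflow_period_end(const struct dl_params *p,
				const struct dl_state *s, uint64_t t)
{
	uint64_t period_end;

	assert(p->dl_deadline <= p->dl_period);
	/* Assumption: the reservation period ends here. */
	period_end = s->deadline + (p->dl_period - p->dl_deadline);
	assert(period_end > t);
	return s->runtime * p->dl_period > (period_end - t) * p->dl_runtime;
}
```

With dl_runtime = 20, dl_deadline = 20, dl_period = 100, a task that
has consumed half of its runtime and wakes up again at t = 10
(runtime = 10, deadline = 20) overflows according to the first check,
but not according to the other two.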


> - if we instead move to use densities when doing AC
> (dl_runtime/dl_deadline), I think we should also change the check in
> dl_entity_overflow, as Steve is proposing
>
> - in both cases Daniel's fixes look sensible to have
Yes, Daniel's fixes fix a possible DoS, so they should go in... Then,
we can decide how to improve the situation.
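
For reference, the two admission tests being discussed (utilisations
versus densities) can be sketched as below. This is only a
single-CPU user-space illustration, not the kernel admission control:
the names, the 20-bit fixed-point ratio and the bound of 1.0 are
assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RATIO_SHIFT	20	/* hypothetical fixed-point precision */
#define RATIO_ONE	((uint64_t)1 << RATIO_SHIFT)

/* Hypothetical model of the static reservation parameters. */
struct dl_params {
	uint64_t dl_runtime;
	uint64_t dl_deadline;
	uint64_t dl_period;
};

/* runtime / interval as a fixed-point fraction (rounded down) */
static uint64_t ratio(uint64_t runtime, uint64_t interval)
{
	assert(interval > 0);
	return (runtime << RATIO_SHIFT) / interval;
}

/*
 * Admit the task set if the sum of the per-task bandwidths is <= 1.
 * use_density == false: sum of dl_runtime / dl_period   (utilisation)
 * use_density == true:  sum of dl_runtime / dl_deadline (density)
 */
static bool admit(const struct dl_params *tasks, size_t n, bool use_density)
{
	uint64_t total = 0;
	size_t i;

	for (i = 0; i < n; i++)
		total += ratio(tasks[i].dl_runtime,
			       use_density ? tasks[i].dl_deadline
					   : tasks[i].dl_period);

	return total <= RATIO_ONE;
}
```

Two tasks with dl_runtime = 20, dl_deadline = 20, dl_period = 100 have
a total utilisation of 0.4 and are admitted by the first test, but
their total density is 2, so the density-based test rejects them.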

>
> Where am I wrong? :)
>
> Actually, another thing that we noticed, talking on IRC with Peter, is
> that we seem to be replenishing differently on different occasions:
>
> - on wakeup (if overflowing) we do
>
> dl_se->deadline = rq_clock(rq) + pi_se->dl_deadline;
> dl_se->runtime = pi_se->dl_runtime;
>
> - when the replenishment timer fires (un-throttle and with runtime <
> 0)
>
> dl_se->deadline += pi_se->dl_period;
> dl_se->runtime += pi_se->dl_runtime;
>
> Isn't this problematic as well?
I _think_ this is correct, because they are two different things: in
the first case, we generate a new scheduling deadline starting from
current time (so, the deadline must be computed based on the relative
deadline); in the second case, we postpone an existing scheduling
deadline (so, it must be postponed by one period)[*]... No? Or am I
misunderstanding the issue you saw?
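
The two cases can be written side by side as follows. This is a
user-space sketch with made-up struct and function names, only meant to
show the difference between generating a new deadline and postponing an
existing one:

```c
#include <stdint.h>

/* Hypothetical model of the static reservation parameters. */
struct dl_params {
	int64_t dl_runtime;
	uint64_t dl_deadline;	/* relative deadline */
	uint64_t dl_period;
};

/* Hypothetical model of the current scheduling state. */
struct dl_state {
	int64_t runtime;	/* can be negative after an overrun */
	uint64_t deadline;	/* absolute scheduling deadline */
};

/*
 * Wakeup with overflow: the old (runtime, deadline) pair is discarded
 * and a new scheduling deadline is generated starting from "now", so
 * the relative deadline is used.
 */
static void replenish_from_now(struct dl_state *s,
			       const struct dl_params *p, uint64_t now)
{
	s->deadline = now + p->dl_deadline;
	s->runtime = p->dl_runtime;
}

/*
 * Replenishment timer: the existing scheduling deadline is postponed
 * by one period, and any overrun (negative runtime) is charged against
 * the new budget.
 */
static void replenish_postpone(struct dl_state *s,
			       const struct dl_params *p)
{
	s->deadline += p->dl_period;
	s->runtime += p->dl_runtime;
}
```

For example, with dl_runtime = 20 and dl_period = 100, a task throttled
with runtime = -3 and deadline = 20 ends up with runtime = 17 and
deadline = 120 after the postponement.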


Thanks,
Luca

[*] Notice that with Daniel's fix the replenishment timer fires at the
end of the reservation period (or, at the beginning of a new
reservation period). So, "current time + dl_deadline" is about equal to
"deadline + period" (but using "current time + dl_deadline" would
generate larger deadlines if the timer fires later than expected).
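
A small numeric check of the note above, as a user-space sketch. The
helper names are made up, and the end of the reservation period is
assumed to be deadline + (dl_period - dl_deadline):

```c
#include <assert.h>
#include <stdint.h>

/* Assumed instant at which the replenishment timer is expected to fire. */
static uint64_t period_end(uint64_t deadline, uint64_t dl_deadline,
			   uint64_t dl_period)
{
	assert(dl_deadline <= dl_period);
	return deadline + (dl_period - dl_deadline);
}

/* "deadline + period": postpone the existing scheduling deadline. */
static uint64_t deadline_postponed(uint64_t deadline, uint64_t dl_period)
{
	return deadline + dl_period;
}

/* "current time + dl_deadline": generate a deadline from the firing time. */
static uint64_t deadline_from_now(uint64_t now, uint64_t dl_deadline)
{
	return now + dl_deadline;
}
```

With deadline = 20, dl_deadline = 20 and dl_period = 100, the timer is
expected at t = 100 and both rules give 120. If the timer fires 3 time
units late, "current time + dl_deadline" gives 123, while
"deadline + period" still gives 120.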