Re: [PATCH 0/8] sched/deadline: Return the best satisfying affinity and dl in cpudl_find

From: Juri Lelli
Date: Tue Mar 28 2017 - 03:12:08 EST


On 28/03/17 09:42, Byungchul Park wrote:
> On Mon, Mar 27, 2017 at 03:05:07PM +0100, Juri Lelli wrote:
> > Hi,
> >
> > On 23/03/17 19:32, Byungchul Park wrote:
> > > cpudl_find() is used to find a cpu having the latest dl. The function
> > > should return the cpu with the latest dl among those satisfying the
> > > task's affinity and dl constraint, but the current code gives up
> > > immediately and returns failure when the test fails for the maximum
> > > cpu *alone*.
> > >
> > > For example:
> > >
> > > cpu 0 is running a task (dl: 10).
> > > cpu 1 is running a task (dl: 9).
> > > cpu 2 is running a task (dl: 8).
> > > cpu 3 is running a task (dl: 2).
> > >
> > > where cpu 3 wants to push a task (affinity is 1 2 3 and dl is 1).
> >
> > Hummm, but this should only happen if you disable admission control,
> > right? Otherwise task's affinity can't be smaller than 0-3.
>
> Hi Juri,
>
> Can I ask you what admission control is? Do you mean affinity setting?

sched_setattr() for DEADLINE tasks performs a set of checks before
admitting the task to the system. Please have a look at
Documentation/scheduler/sched-deadline.txt, Section 5, for what concerns
affinity.

> And do you mean s/disable/enable? Or am I misunderstanding?
>

No, I meant disable. The problem you are pointing out can only happen
if admission control is disabled. With admission control enabled it
can't, as we enforce that tasks have affinity equal to the span of the
root_domain to which they belong. E.g., in your case the task will have
affinity set to 0-3 (or it won't be able to enter the system), so that
would make the problem go away.
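
To make the affinity condition concrete, here is a minimal userspace
sketch of it. It is not the kernel code: cpu_mask_t, cpu_bit() and
dl_affinity_admissible() are hypothetical names made up for this
illustration, and cpus are modelled as bits in a 64-bit word. The
assumption is that admission fails whenever some cpu of the root_domain
span is missing from the task's affinity mask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* One bit per cpu: bit n set means cpu n is in the mask. */
typedef uint64_t cpu_mask_t;

/* Build a single-cpu mask (illustrative helper, limited to 64 cpus). */
static cpu_mask_t cpu_bit(int cpu)
{
	assert(cpu >= 0 && cpu < 64);
	return (cpu_mask_t)1 << cpu;
}

/*
 * Hypothetical helper, not a kernel function: returns true when the
 * task's allowed cpus cover the whole root_domain span, which is the
 * affinity condition admission control is assumed to enforce here.
 */
static bool dl_affinity_admissible(cpu_mask_t rd_span, cpu_mask_t allowed)
{
	/* Any span cpu missing from the task's mask means rejection. */
	return (rd_span & ~allowed) == 0;
}
```

With a span of cpus 0-3, a task whose affinity is 1 2 3 would be
rejected under this model, while a task with affinity 0-3 would be
admitted.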

> > >
> > > In this case, the task should be migrated from cpu 3 to cpu 1, and
> > > preempt cpu 1's task. However, the current code just returns failure
> > > because it fails the affinity test with the maximum cpu, that is,
> > > cpu 0.
> > >
> > > This patch set tries to find the best cpu among those satisfying the
> > > task's affinity and dl constraint, until it succeeds or no
> > > candidates remain.
> > >
> >
> > Anyway, do you have numbers showing how common your failure scenario is?
>
> Actually, it depends heavily on how the test environment is set up. I
> can provide you ones which generate many fails. IMHO, it's not a matter
> of frequency but a matter of whether it works correctly. As you know,
> rt policy already works correctly regarding this problem.
>

Right. But, my point is that if what you are highlighting turns out to
be a pretty frequent situation, maybe we need to find a better data
structure to speed up push operations or we will end up using the slow
path most of the time, making the heap useless.
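
The trade-off can be sketched with a small userspace model. This is
not the kernel's cpudl_find(): find_fast(), find_slow(), dl_before()
and max_dl_cpu() are hypothetical names, idle cpus are ignored, and
the heap root is recomputed by a plain loop instead of being read from
a heap. The fast path tests only the cpu with the latest deadline, the
slow path scans every cpu in the task's affinity mask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS 4

/* True if deadline a is earlier than b (wrap-safe signed difference). */
static bool dl_before(uint64_t a, uint64_t b)
{
	return (int64_t)(a - b) < 0;
}

/* Index of the cpu running the latest deadline, i.e. the heap's root. */
static int max_dl_cpu(const uint64_t *cpu_dl)
{
	int best = 0;

	assert(cpu_dl);
	for (int cpu = 1; cpu < NR_CPUS; cpu++)
		if (dl_before(cpu_dl[best], cpu_dl[cpu]))
			best = cpu;
	return best;
}

/* Fast path: test only the latest-deadline cpu; -1 if it does not qualify. */
static int find_fast(const uint64_t *cpu_dl, unsigned int allowed,
		     uint64_t task_dl)
{
	int cpu = max_dl_cpu(cpu_dl);

	if ((allowed & (1u << cpu)) && dl_before(task_dl, cpu_dl[cpu]))
		return cpu;
	return -1;
}

/* Slow path: scan every allowed cpu, keep the latest deadline the task beats. */
static int find_slow(const uint64_t *cpu_dl, unsigned int allowed,
		     uint64_t task_dl)
{
	int best = -1;

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (!(allowed & (1u << cpu)))
			continue;	/* outside the task's affinity */
		if (!dl_before(task_dl, cpu_dl[cpu]))
			continue;	/* task would not preempt this cpu */
		if (best < 0 || dl_before(cpu_dl[best], cpu_dl[cpu]))
			best = cpu;
	}
	return best;
}

/* Deadlines from the example in this thread: cpu 0..3 run dl 10, 9, 8, 2. */
static const uint64_t example_dl[NR_CPUS] = { 10, 9, 8, 2 };
```

For the example in this thread (affinity 1 2 3, i.e. mask 0xE, and
dl 1), find_fast() gives up because cpu 0 is outside the mask, while
find_slow() returns cpu 1 at the cost of a linear scan. The more often
the fast path misses, the less the heap buys us.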

> In other words, if there are dl tasks in a system like:
>
> task a (dl: 1) -+ -+
> task b (dl: 2) -| -|
> task c (dl: 3) -| -|
> task d (dl: 4) -| -+- should be run on 4 cpus machine
> task e (dl: 5) -|
> task f (dl: 6) -|
> task g (dl: 7) -|
> task h (dl: 8) -+- should be run on 8 cpus machine
> task i (dl: 9)
> task j (dl: 10)
>
> IMHO, the deadline scheduler should ensure that the N most urgent tasks
> are running, where N is the number of cpus in the system, as long as
> their affinities are satisfied. What do you think about this?
>

Correct. But please see above regarding affinities.