Re: [PATCH 0/7] sched/deadline: fix cpusets bandwidth accounting

From: Mathieu Poirier
Date: Fri Aug 25 2017 - 16:36:06 EST


On 25 August 2017 at 03:52, Luca Abeni <luca.abeni@xxxxxxxxxxxxxxx> wrote:
> On Fri, 25 Aug 2017 08:02:43 +0200
> luca abeni <luca.abeni@xxxxxxxxxxxxxxx> wrote:
> [...]
>> > The above demonstrates that even if we have two CPUsets, new tasks belong
>> > to the "default" CPUset and as such can use all the available CPUs.
>>
>> I still have a doubt (probably showing all my ignorance about
>> CPUsets :)... In this situation, we have 3 CPUsets: "default",
>> set1, and set2... Is every one of these CPUsets associated with a
>> root domain (so, we have 3 root domains)? Or are only set1 and set2
>> associated with a root domain?
>
> Ok, after reading (and hopefully understanding better :) the code, I
> think this question was kind of silly... There are only 2 root domains,
> corresponding to set1 and set2 (right?).

For this scenario yes, you are correct.

>
> [...]
>
>> > So above we'd run the acceptance test on root
>> > domain A and B before promoting the task. Of course we'd also have to
>> > add the utilisation of that task to both root domains. Although simple,
>> > it goes to the core of the DL scheduler and touches pretty much every
>> > aspect of it, something I'm reluctant to embark on.
>>
>> I see... So, the "default" CPUset does not have any root domain
>> associated with it? If it had, we could just subtract the maximum
>> utilizations of set1 and set2 from it when creating the root domains of
>> set1 and set2.
> ...
> So, this idea of mine made no sense.
>
> I think the correct solution is what you implemented in your patchset
> (if I understand it correctly).
>
> If we want to have tasks spanning multiple root domains, many more
> changes in the code are needed... I am wondering if it would make more
> sense to track utilizations per runqueue (instead of per root domain):
> - when a task tries to become SCHED_DEADLINE, we count how many CPUs are
>   in its affinity mask. Let's call this number "n"
> - then, we add u / n (where "u" is the task's utilization) to the
>   utilization of every runqueue that is in its affinity mask, and we
>   check if all the sums are below the schedulability bound
>
> For tasks spanning one single root domain, this should be equivalent to
> the current admission test. Moreover, this check should ensure that no
> root domain can ever be overloaded (even if tasks span multiple
> domains).
> But I do not know the locking implications for this idea... I suspect
> it will not scale :(
>
>
>
> Luca
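
To make the per-runqueue idea quoted above concrete, here is a rough
userspace sketch of what such a check might look like. It is not kernel
code and not part of the patchset: struct rq_bw, dl_admit_per_rq(), the
4-CPU limit and the fixed-point scale are all made up for illustration, a
plain bitmask stands in for the affinity cpumask, there is no locking at
all, and I am assuming a sum exactly equal to the bound is still accepted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS   4                             /* hypothetical small system */
#define BW_SHIFT  20                            /* assumed fixed-point scale */
#define BW_UNIT   ((uint64_t)1 << BW_SHIFT)     /* utilization 1.0 */

/* Hypothetical per-runqueue accounting; NOT the kernel's struct rq. */
struct rq_bw {
	uint64_t total_bw;      /* sum of the u / n shares admitted on this CPU */
};

static struct rq_bw rq_bw[NR_CPUS];

/*
 * Try to admit a task with utilization u (fixed point) whose affinity
 * mask has bit i set when CPU i is allowed.  Every runqueue in the mask
 * is charged u / n, where n is the number of CPUs in the mask.  The task
 * is admitted only if no affected runqueue would exceed 'bound'.
 */
static bool dl_admit_per_rq(uint64_t u, unsigned int mask, uint64_t bound)
{
	unsigned int cpu, n = 0;
	uint64_t share;

	assert(bound <= BW_UNIT);       /* a per-CPU bound above 1.0 is meaningless */

	/* Step 1: count the CPUs in the affinity mask ("n"). */
	for (cpu = 0; cpu < NR_CPUS; cpu++)
		if (mask & (1u << cpu))
			n++;
	if (n == 0)
		return false;

	share = u / n;

	/* Step 2: check every affected runqueue before touching any state. */
	for (cpu = 0; cpu < NR_CPUS; cpu++)
		if ((mask & (1u << cpu)) && rq_bw[cpu].total_bw + share > bound)
			return false;

	/* Step 3: all checks passed, so commit the shares. */
	for (cpu = 0; cpu < NR_CPUS; cpu++)
		if (mask & (1u << cpu))
			rq_bw[cpu].total_bw += share;

	return true;
}
```

The check and the commit are two separate passes so that a rejected task
leaves the accounting untouched. In a real implementation both passes
would have to be atomic with respect to every runqueue in the mask, which
is presumably where the scalability concern raised above comes from.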