Re: [PATCH V2 3/7] sched/deadline: Keep new DL task within root domain's boundary
From: Juri Lelli
Date: Fri Feb 02 2018 - 09:35:35 EST
Hi Mathieu,
On 01/02/18 09:51, Mathieu Poirier wrote:
> When moving a task to the DL policy we need to make sure that the CPUs
> it is allowed to run on match the CPUs of the root domain of the
> runqueue it is currently assigned to. Otherwise the task will be
> allowed to roam on CPUs outside of this root domain, something that will
> skew system deadline statistics and potentially lead to overselling DL
> bandwidth.
>
> For example, say we have a 4-core system split into 2 cpusets: set1 has
> CPUs 0 and 1 while set2 has CPUs 2 and 3. This results in 3 cpusets: the
> default set that has all 4 CPUs, along with set1 and set2 as just
> described. We also have task A, which hasn't been assigned to any cpuset
> and as such is part of the default cpuset.
>
> At the time we want to move task A to a DL policy it has been assigned to
> CPU1. Since CPU1 is part of set1 the root domain will have 2 CPUs in it
> and the bandwidth constraint is checked against the current DL bandwidth
> allotment of those 2 CPUs.
Wait.. I'm confused. :)
Did you disable cpuset.sched_load_balance in the root (default) cpuset?
If yes, we would end up with 2 root domains and if task A happens to be
on root domain (0-1) checking its admission against 2 CPUs looks like
the right thing to do to me. If no, then there is a single root domain
(the root/default one) with 4 CPUs, and it indeed seems that we've
probably got a problem: it is possible for a DEADLINE task running on
root/default cpuset to be put in (for example) 0-1 cpuset, and so
restrict its affinity. Is this what the patch fixes?
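To make the two cases concrete, here is a rough userspace sketch of the kind of admission test being discussed. It is not the kernel code: struct rd_model and dl_would_overflow() are hypothetical names made up for illustration, and the model only assumes that the check compares (per-CPU bandwidth cap * number of CPUs in the root domain) against the bandwidth already admitted plus the new task's bandwidth, with utilization stored as a fixed-point ratio.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BW_SHIFT 20
#define BW_UNIT  (1ULL << BW_SHIFT)	/* fixed-point value of 1.0 CPU */

/* Fixed-point utilization: runtime / period, scaled by 2^BW_SHIFT. */
static uint64_t to_ratio(uint64_t period_ns, uint64_t runtime_ns)
{
	assert(period_ns > 0);
	return (runtime_ns << BW_SHIFT) / period_ns;
}

/* Hypothetical, simplified stand-in for a root domain's DL accounting. */
struct rd_model {
	int      cpus;		/* CPUs spanned by the root domain */
	uint64_t bw;		/* per-CPU bandwidth cap */
	uint64_t total_bw;	/* bandwidth already admitted in the domain */
};

/* True if admitting new_bw would exceed the domain's total capacity. */
static bool dl_would_overflow(const struct rd_model *rd, uint64_t new_bw)
{
	return rd->bw * (uint64_t)rd->cpus < rd->total_bw + new_bw;
}
```

With an assumed per-CPU cap of 95%, a root domain of 2 CPUs that already carries 1.7 CPUs' worth of DL bandwidth would reject a task with 30% utilization, while the same load checked against a 4-CPU root domain would be accepted. That is why it matters which root domain (and thus how many CPUs) the task is checked against.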
Anyway, see more comments below..
[...]
> /*
> + * If setscheduling to SCHED_DEADLINE we need to make sure the task
> + * is constrained to run within the root domain it is associated with,
> + * something that isn't guaranteed when using cpusets.
> + *
> + * Speaking of cpusets, we also need to assert that a task's
> + * cpus_allowed mask equals its cpuset's cpus_allowed mask. Otherwise
> + * a DL task could be assigned to a cpuset that has more CPUs than the
> + * root domain it is associated with, a situation that yields no
> + * benefits and greatly complicate the management of DL task when
> + * cpusets are present.
> + */
> + if (dl_policy(policy)) {
> + struct root_domain *rd = cpu_rq(task_cpu(p))->rd;
I fear root_domain doesn't exist on UP.
Maybe this logic could go further up, by changing the check we already
do against the span?
https://elixir.free-electrons.com/linux/latest/source/kernel/sched/core.c#L4174
Best,
- Juri