[PATCH 3/7] sched/deadline: Keep new DL task within root domain's boundary

From: Mathieu Poirier
Date: Wed Aug 16 2017 - 17:21:03 EST


When considering moving a task to the DL policy we need to make sure
the CPUs it is allowed to run on match the CPUs of the root domain of
the runqueue it is currently assigned to. Otherwise the task will be
allowed to roam on CPUs outside of this root domain, something that will
skew system deadline statistics and potentially lead to overselling DL
bandwidth.

For example, say we have a 4-core system split into 2 cpusets: set1 has
CPU 0 and 1 while set2 has CPU 2 and 3. This results in 3 cpusets: the
default set that has all 4 CPUs, along with set1 and set2 as just
described. We also have task A that hasn't been assigned to any cpuset
and as such is part of the default cpuset.

At the time we want to move task A to a DL policy it has been assigned to
CPU1. Since CPU1 is part of set1 the root domain will have 2 CPUs in it
and the bandwidth constraint is checked against the current DL bandwidth
allotment of those 2 CPUs.

If task A is promoted to a DL policy its 'cpus_allowed' mask is still
equal to the CPUs in the default CPUset, making it possible for the
scheduler to move it to CPU2 and CPU3, which could also be running DL tasks
of their own.

This patch makes sure that a task's cpus_allowed mask matches the CPUs
in the root domain associated with the runqueue it has been assigned to,
and fails the switch to the DL policy otherwise.

Signed-off-by: Mathieu Poirier <mathieu.poirier@xxxxxxxxxx>
---
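Note for reviewers: the 4-CPU example from the changelog can be modelled in
userspace as a rough sketch. Everything below is hypothetical and only for
illustration: cpu_mask_t, the CPUSET_* constants, mask_equal() and
dl_affinity_check() are not kernel symbols, and a plain bitmask stands in
for struct cpumask. The -1 return value is an assumption about the error
path taken when the new check fails.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct cpumask: bit N set means CPU N. */
typedef uint32_t cpu_mask_t;

#define CPUSET_DEFAULT ((cpu_mask_t)0xf) /* CPU 0-3, the default cpuset */
#define CPUSET_SET1    ((cpu_mask_t)0x3) /* CPU 0-1 */
#define CPUSET_SET2    ((cpu_mask_t)0xc) /* CPU 2-3 */

/* Userspace analogue of cpumask_equal(): true when both masks match. */
static bool mask_equal(cpu_mask_t a, cpu_mask_t b)
{
	return a == b;
}

/*
 * Models only the affinity test this patch adds, not the bandwidth
 * accounting that follows it. Returns 0 when the task is confined to
 * exactly the root domain's CPUs, -1 (assumed error value) otherwise.
 */
static int dl_affinity_check(cpu_mask_t cpus_allowed, cpu_mask_t rd_span)
{
	return mask_equal(cpus_allowed, rd_span) ? 0 : -1;
}
```

With task A on CPU1, the root domain span is CPUSET_SET1. Calling
dl_affinity_check(CPUSET_DEFAULT, CPUSET_SET1) returns -1, so the move to DL
is refused until the task is restricted to CPU 0-1, at which point
dl_affinity_check(CPUSET_SET1, CPUSET_SET1) returns 0.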
kernel/sched/deadline.c | 16 +++++++++++++++-
1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index ba64a5b8f40b..2c0405d74367 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -2474,6 +2474,7 @@ int sched_dl_overflow(struct task_struct *p, int policy,
const struct sched_attr *attr)
{
struct dl_bw *dl_b = dl_bw_of(task_cpu(p));
+ struct root_domain *rd = cpu_rq(task_cpu(p))->rd;
u64 period = attr->sched_period ?: attr->sched_deadline;
u64 runtime = attr->sched_runtime;
u64 new_bw = dl_policy(policy) ? to_ratio(period, runtime) : 0;
@@ -2484,6 +2485,19 @@ int sched_dl_overflow(struct task_struct *p, int policy,
return 0;

/*
+ * By default a task is set to run on all the CPUs the system
+ * knows about. This is fine for as long as we don't deal with cpusets
+ * where runqueues are split between root domains. The computation
+ * below is based on root domain information, as such the task must be
+ * constrained to run within that root domain. It is the user's
+ * responsibility to constrain the task to specific CPUs by either
+ * assigning the task to a cpuset or running the taskset utility. Here we
+ * simply make sure things are coherent.
+ */
+ if (!cpumask_equal(&p->cpus_allowed, rd->span))
+ goto out;
+
+ /*
* Either if a task, enters, leave, or stays -deadline but changes
* its parameters, we may need to update accordingly the total
* allocated bandwidth of the container.
@@ -2518,7 +2532,7 @@ int sched_dl_overflow(struct task_struct *p, int policy,
err = 0;
}
raw_spin_unlock(&dl_b->lock);
-
+out:
return err;
}

--
2.7.4