Hello.
On Mon, Mar 06, 2023 at 03:08:47PM -0500, Waiman Long <longman@xxxxxxxxxx> wrote:
On a system with asymmetric CPUs, a restricted task is one that can run
only a selected subset of available CPUs. When a CPU goes offline or
when "cpuset.cpus" is changed, it is possible that a restricted task
may not have any runnable CPUs left in the current cpuset even if there
are still some CPUs in effective_cpus. In this case, the restricted task
cannot be run at all.
There are several ways we may be able to handle this situation. Treating
it like empty effective_cpus is probably too disruptive and is unfair to
the normal tasks. So it is better to have some special handling for these
restricted tasks. One possibility is to move the restricted tasks up the
cpuset hierarchy, but it is tricky to do it right. Another solution is
to assign other usable CPUs to these tasks. This patch implements the
latter alternative by finding one usable CPU by walking up the cpuset
hierarchy and printing an informational message to let the users know
that these restricted tasks are running in a cpuset with no usable CPU.
Signed-off-by: Waiman Long <longman@xxxxxxxxxx>
---
kernel/cgroup/cpuset.c | 56 +++++++++++++++++++++++++++++++++++++++++-
1 file changed, 55 insertions(+), 1 deletion(-)
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index bbf57dcb2f68..aa8225daf1d3 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -1202,6 +1202,38 @@ void rebuild_sched_domains(void)
cpus_read_unlock();
}
[...]
/**
* update_tasks_cpumask - Update the cpumasks of tasks in the cpuset.
* @cs: the cpuset in which each task's cpus_allowed mask needs to be changed
@@ -1218,6 +1250,7 @@ static void update_tasks_cpumask(struct cpuset *cs, struct cpumask *new_cpus)
struct task_struct *task;
bool top_cs = cs == &top_cpuset;
+ percpu_rwsem_assert_held(&cpuset_rwsem);
css_task_iter_start(&cs->css, 0, &it);
while ((task = css_task_iter_next(&it))) {
const struct cpumask *possible_mask = task_cpu_possible_mask(task);
@@ -1232,7 +1265,28 @@ static void update_tasks_cpumask(struct cpuset *cs, struct cpumask *new_cpus)
} else {
cpumask_and(new_cpus, cs->effective_cpus, possible_mask);
}
- set_cpus_allowed_ptr(task, new_cpus);
+ /*
+ * On systems with asymmetric CPUs, it is possible that
+ * cpumask will become empty or set_cpus_allowed_ptr() will
+ * return an error even if we still have CPUs in
+ * effective_cpus. In this case, we find a usable CPU walking
+ * up the cpuset hierarchy and use that for this particular
+ * task with an informational message about the change in the
+ * hope that the users will adjust "cpuset.cpus" accordingly.
+ */
+ if (cpumask_empty(new_cpus) ||
+ set_cpus_allowed_ptr(task, new_cpus)) {
IIUC, cpumask_empty(new_cpus) here implies
cpumask_empty(cs->effective_cpus) but that shouldn't happen (cs should
inherit non-empty mask from an ancestor). Am I missing or forgetting anything?
This thus covers the case when p->user_cpus_ptr is incompatible with
hotplug or cpuset.cpus allowance and a different affinity must be
chosen. But doesn't that mean that the task would run _outside_ of
cs->effective_cpus?
I guess that's unavoidable on asymmetric CPU archs but not on SMPs.
Shouldn't the solution distinguish between the two? (I.e. never run outside
effective_cpus on SMP.)
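To make the distinction concrete, here is a rough userspace model of the policy I have in mind. It is only a sketch, not kernel code: masks are plain 64-bit bitmaps, and model_cpuset, model_pick_cpus() and the asymmetric flag are made-up names for illustration. It also assumes that dropping the user-requested affinity is acceptable when it no longer intersects the cpuset.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace model only: bit n of a mask stands for CPU n. */
typedef uint64_t mask_t;

/* Hypothetical stand-in for struct cpuset: effective CPUs and a parent link. */
struct model_cpuset {
	mask_t effective_cpus;
	const struct model_cpuset *parent;	/* NULL for the root */
};

/*
 * Choose the mask a task should run on.
 *   possible:   CPUs the task can execute on at all (all CPUs on SMP)
 *   user:       affinity the task asked for (models p->user_cpus_ptr)
 *   asymmetric: true only on systems with asymmetric CPUs
 * Returns 0 when no CPU may be used under this policy.
 */
static mask_t model_pick_cpus(const struct model_cpuset *cs, mask_t possible,
			      mask_t user, bool asymmetric)
{
	mask_t allowed = cs->effective_cpus & possible;

	/* Best case: honor the user affinity inside the cpuset. */
	if (allowed & user)
		return allowed & user;

	/* User affinity is incompatible: drop it but stay inside the cpuset. */
	if (allowed)
		return allowed;

	/* SMP: never run outside effective_cpus. */
	if (!asymmetric)
		return 0;

	/* Asymmetric only: walk up until an ancestor offers a usable CPU. */
	for (cs = cs->parent; cs; cs = cs->parent) {
		allowed = cs->effective_cpus & possible;
		if (allowed)
			return allowed & (~allowed + 1);	/* lowest set bit, i.e. one CPU */
	}
	return 0;
}
```

With a root cpuset holding CPUs 0-3 and a child holding CPUs 2-3, an SMP task whose user affinity is CPU 0 ends up on CPUs 2-3 (still inside the child), while an asymmetric task that can only execute on CPUs 0-1 is the only case that gets a CPU borrowed from the root.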