Re: Scheduler regression from caffcdd8d27ba78730d5540396ce72ad022aff2c

From: Peter Zijlstra
Date: Tue Jul 22 2014 - 05:47:51 EST


On Mon, Jul 21, 2014 at 06:52:12PM +0200, Peter Zijlstra wrote:
> On Mon, Jul 21, 2014 at 11:35:28AM -0500, Bruno Wolff III wrote:
> > Is there more I can do to help with this now? Or should I just wait for
> > patches to test?
>
> Yeah, sorry, was wiped out today. I'll go stare harder at the P4
> topology setup code tomorrow. Something fishy there.

Does this make your machine boot again (while giving an error)?

It tries to robustify the topology setup a bit; crashing on crap input
should be avoided if possible, of course.

I'll go stare at the x86/P4 topology code like promised.

---
Subject: sched: Robustify topology setup
From: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Date: Mon Jul 21 23:07:06 CEST 2014

We hard assume that higher topology levels are strict supersets of
lower levels.

Detect, warn, and try to fix up when this assumption is violated.

Signed-off-by: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Link: http://lkml.kernel.org/n/tip-cgp9j2tk0qnunhtpps3udsom@xxxxxxxxxxxxxx
---
kernel/sched/core.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)

--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6480,6 +6480,20 @@ struct sched_domain *build_sched_domain(
 		sched_domain_level_max = max(sched_domain_level_max, sd->level);
 		child->parent = sd;
 		sd->child = child;
+
+		if (!cpumask_subset(sched_domain_span(child),
+				    sched_domain_span(sd))) {
+			pr_err("BUG: arch topology borken\n");
+#ifdef CONFIG_SCHED_DEBUG
+			pr_err(" the %s domain not a subset of the %s domain\n",
+					child->name, sd->name);
+#endif
+			/* Fixup, ensure @sd has at least @child cpus. */
+			cpumask_or(sched_domain_span(sd),
+				   sched_domain_span(sd),
+				   sched_domain_span(child));
+		}
+
 	}
 	set_domain_attribute(sd, attr);
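
For anyone who wants to poke at the check outside the kernel, here is a
rough userspace sketch of the same idea: test whether the child span is
contained in the parent span, complain if it is not, then OR the child
into the parent. It is not kernel code. The names span_t, span_subset()
and fixup_parent_span() are made up for illustration, and a plain 64-bit
word stands in for a cpumask, so it only models up to 64 CPUs.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for a cpumask: bit N set means CPU N is in the
 * span. Assumption for this sketch only; real cpumasks are not limited
 * to 64 bits.
 */
typedef uint64_t span_t;

/* Return 1 if every CPU in @child is also in @parent, else 0. */
static int span_subset(span_t child, span_t parent)
{
	return (child & ~parent) == 0;
}

/*
 * Model of the fixup: if @child is not contained in @parent, report it
 * and widen @parent so it has at least the CPUs of @child.
 * Returns 1 if a fixup was applied, 0 if the spans were already sane.
 */
static int fixup_parent_span(span_t child, span_t *parent)
{
	assert(parent != NULL);

	if (span_subset(child, *parent))
		return 0;

	fprintf(stderr, "topology broken: child 0x%llx not within parent 0x%llx\n",
		(unsigned long long)child, (unsigned long long)*parent);

	/* Same effect as OR-ing the child span into the parent span. */
	*parent |= child;
	return 1;
}
```

With a child span of 0x3 (CPUs 0 and 1) and a parent span of 0x5 (CPUs 0
and 2), fixup_parent_span() returns 1 and leaves the parent at 0x7. As in
the patch, this only papers over the bad input so later code does not
trip on it; the topology data that produced it is still wrong.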
