Re: Random panic in load_balance() with 3.16-rc

From: Linus Torvalds
Date: Wed Jul 23 2014 - 12:54:30 EST


On Wed, Jul 23, 2014 at 8:55 AM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>>
>> I haven't seen the full oops, can you forward the screenshot? The
>> exact register state might give some clues.
>
> Sure, here goes.

So the length is fine, and the disassembly shows that it is fixed (16
32-bit words - why the heck does it use "movsl" rather than "movsq",
whatever).

The problem is %rdi, which has the value ffff10043c803e8c, which isn't
canonical. Which is why it GP-faults.
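
To make the "isn't canonical" point concrete, here is a minimal userspace sketch of the check, assuming 48-bit virtual addresses (4-level paging). The names is_canonical() and VADDR_BITS are made up for this illustration and are not kernel APIs.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumption: 48-bit virtual addresses. An address is canonical when
 * bits 63..47 are all copies of bit 47 (all zeros or all ones).
 */
#define VADDR_BITS 48

/* Hypothetical helper, not a kernel function. */
static bool is_canonical(uint64_t addr)
{
	/* Keep bits 63..47: 17 bits that must be identical. */
	uint64_t top = addr >> (VADDR_BITS - 1);

	return top == 0 || top == 0x1ffff;
}
```

With this check, 0xffff10043c803e8c fails: bits 63..48 are all ones, but bit 47 is zero, so dereferencing it raises a general protection fault rather than a page fault.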

That value is loaded from the stack:

mov -0x88(%rbp),%rdi

so apparently the original "__get_cpu_var(load_balance_mask)" is
already corrupted, or something has corrupted it on the stack since
loading (but that looks unlikely).

And I wonder if I have a clue. Look, load_balance_mask is a
"cpumask_var_t", but I don't see an "alloc_cpumask_var()" for it.
That's broken with CONFIG_CPUMASK_OFFSTACK.

I think you actually want "load_balance_mask" to be a "struct cpumask *", no?

Alternatively, keep it a "cpumask_var_t", but then you need to use
__get_cpu_pointer() to get the address of it, and use
"alloc_cpumask_var()" to allocate the area for the OFFSTACK case.

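The difference between the two shapes of "cpumask_var_t" can be modelled in userspace. This is only a sketch: the two typedefs mirror how the kernel defines the type with and without CONFIG_CPUMASK_OFFSTACK, but every name here (onstack_var_t, offstack_var_t, MODEL_NR_CPUS and the helpers) is hypothetical and exists only for this illustration.

```c
#include <stdbool.h>
#include <stdlib.h>

#define MODEL_NR_CPUS 64	/* hypothetical size, illustration only */

struct cpumask {
	unsigned long bits[MODEL_NR_CPUS / (8 * sizeof(unsigned long))];
};

/* Without OFFSTACK: the variable itself holds the mask storage. */
typedef struct cpumask onstack_var_t[1];

/* With OFFSTACK: the variable is only a pointer to storage elsewhere. */
typedef struct cpumask *offstack_var_t;

/* The array decays to a pointer to its own storage; no allocation needed. */
static struct cpumask *onstack_mask_ptr(onstack_var_t *v)
{
	return *v;
}

/* Returns whatever the pointer holds; NULL here until it is allocated. */
static struct cpumask *offstack_mask_ptr(offstack_var_t *v)
{
	return *v;
}

/* Stand-in for the allocation step that the OFFSTACK case requires. */
static bool offstack_alloc(offstack_var_t *v)
{
	*v = calloc(1, sizeof(**v));
	return *v != NULL;
}

static void offstack_free(offstack_var_t *v)
{
	free(*v);
	*v = NULL;
}
```

In the first form, reading the variable always yields usable storage. In the second, reading it yields only a pointer, which is meaningful only if something has set it up beforehand.
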
TOTALLY UNTESTED AND PROBABLY PURE CRAP PATCH ATTACHED.

WARNING! WARNING! WARNING! This is just looking at the code, not
really knowing it, and saying "that looks really really wrong". Maybe
I'm full of shit.

Linus
kernel/sched/core.c | 2 +-
kernel/sched/fair.c | 4 ++--
2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index bc1638b33449..6980b7ad6da1 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6852,7 +6852,7 @@ struct task_group root_task_group;
 LIST_HEAD(task_groups);
 #endif

-DECLARE_PER_CPU(cpumask_var_t, load_balance_mask);
+DECLARE_PER_CPU(struct cpumask *, load_balance_mask);

 void __init sched_init(void)
 {
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index fea7d3335e1f..ef84a37ba19a 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6421,7 +6421,7 @@ static struct rq *find_busiest_queue(struct lb_env *env,
 #define MAX_PINNED_INTERVAL 512

 /* Working cpumask for load_balance and load_balance_newidle. */
-DEFINE_PER_CPU(cpumask_var_t, load_balance_mask);
+DEFINE_PER_CPU(struct cpumask *, load_balance_mask);

 static int need_active_balance(struct lb_env *env)
 {
@@ -6490,7 +6490,7 @@ static int load_balance(int this_cpu, struct rq *this_rq,
 	struct sched_group *group;
 	struct rq *busiest;
 	unsigned long flags;
-	struct cpumask *cpus = __get_cpu_var(load_balance_mask);
+	struct cpumask *cpus = __this_cpu_read(load_balance_mask);

 	struct lb_env env = {
 		.sd = sd,