Re: Strange hang on ia64 with CONFIG_PRINTK_TIME=y

From: Ingo Molnar
Date: Wed Feb 13 2008 - 07:57:59 EST



* David Miller <davem@xxxxxxxxxxxxx> wrote:

> The kernel now dereferences per-cpu variables very early, essentially
> in the very first printk() (via printk()'s call to cpu_clock()).
>
> This bit me on sparc64 because of how I do the per-cpu address
> formation. If I booted on a non-zero cpuid things would explode.
>
> You might be hitting something similar.

hm. But the raw sched_clock() use was wrong. We could either go back to
jiffies (certainly the simplest option, and what was used before
printk_clock() was introduced, which incorrectly relied on sched_clock())
- but that loses precision, and the same issue will revisit us once we
go totally tickless and start to map jiffies to GTOD ...
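
To illustrate the precision point, here is a minimal user-space sketch. It assumes a tick rate of 250 per second; SKETCH_HZ and tick_stamp_ns() are hypothetical names made up for this example, not kernel symbols.

```c
/* Hypothetical tick rate for illustration; real kernels pick HZ at build time. */
#define SKETCH_HZ 250ULL
#define SKETCH_NSEC_PER_SEC 1000000000ULL

/*
 * Timestamp as a tick counter would report it: the true time in
 * nanoseconds, rounded down to a whole tick.
 */
static unsigned long long tick_stamp_ns(unsigned long long true_ns)
{
	unsigned long long tick_ns = SKETCH_NSEC_PER_SEC / SKETCH_HZ;

	return (true_ns / tick_ns) * tick_ns;
}
```

With these assumed numbers a tick is 4 ms long, so two messages printed at 4.1 ms and 7.9 ms would carry the same timestamp.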

so .. how about the patch below? Note that we already had an "early
bootup" special (the rq->idle check), it's now just made explicit via
the scheduler_running flag.
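
The idea can be sketched in plain user-space C as follows. This is only an illustration of the guard-flag pattern, not the kernel code: every sketch_* name is a hypothetical stand-in, and the fixed clock step replaces the real sched_clock() read.

```c
#include <stdbool.h>

/* All names below are hypothetical stand-ins, not the kernel's definitions. */
struct sketch_rq {
	unsigned long long clock;
};

static struct sketch_rq sketch_runqueue;  /* stands in for per-cpu rq data */
static bool sketch_scheduler_running;     /* stays false until init completes */

/* Stand-in for update_rq_clock(): advance the clock by a fixed step. */
static void sketch_update_rq_clock(struct sketch_rq *rq)
{
	rq->clock += 1000;
}

/* Return 0 until init has run, without touching the runqueue at all. */
static unsigned long long sketch_cpu_clock(void)
{
	if (!sketch_scheduler_running)
		return 0;

	sketch_update_rq_clock(&sketch_runqueue);
	return sketch_runqueue.clock;
}

/* Stand-in for the tail of sched_init(): set the flag as the last step. */
static void sketch_sched_init(void)
{
	sketch_runqueue.clock = 0;
	sketch_scheduler_running = true;
}
```

The point of checking the flag first is that an early caller never dereferences the runqueue, whereas a check on a field inside the runqueue has to reach the data before it can decide anything.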

Ingo

---------------------->
Subject: sched: make sched_clock() early-bootup capable
From: Ingo Molnar <mingo@xxxxxxx>
Date: Wed Feb 13 13:49:36 CET 2008

do not call sched_clock() too early. Not only might rq->idle
not be set up - but pure per-cpu data might not be accessible
either.

Signed-off-by: Ingo Molnar <mingo@xxxxxxx>
---
kernel/sched.c | 14 ++++++++++----
1 file changed, 10 insertions(+), 4 deletions(-)

Index: linux-x86.q/kernel/sched.c
===================================================================
--- linux-x86.q.orig/kernel/sched.c
+++ linux-x86.q/kernel/sched.c
@@ -666,6 +666,8 @@ const_debug unsigned int sysctl_sched_rt
  */
 const_debug unsigned int sysctl_sched_rt_ratio = 62259;

+static __read_mostly int scheduler_running;
+
 /*
  * For kernel-internal use: high-speed (but slightly incorrect) per-cpu
  * clock constructed from sched_clock():
@@ -676,14 +678,16 @@ unsigned long long cpu_clock(int cpu)
 	unsigned long flags;
 	struct rq *rq;

-	local_irq_save(flags);
-	rq = cpu_rq(cpu);
 	/*
 	 * Only call sched_clock() if the scheduler has already been
 	 * initialized (some code might call cpu_clock() very early):
 	 */
-	if (rq->idle)
-		update_rq_clock(rq);
+	if (unlikely(!scheduler_running))
+		return 0;
+
+	local_irq_save(flags);
+	rq = cpu_rq(cpu);
+	update_rq_clock(rq);
 	now = rq->clock;
 	local_irq_restore(flags);

@@ -7255,6 +7259,8 @@ void __init sched_init(void)
 	 * During early bootup we pretend to be a normal task:
 	 */
 	current->sched_class = &fair_sched_class;
+
+	scheduler_running = 1;
 }

 #ifdef CONFIG_DEBUG_SPINLOCK_SLEEP