Re: 2.6.21-rc7-mm2 hangs in boot (netconsole)

From: Andrew Morton
Date: Tue May 01 2007 - 01:39:30 EST


On Tue, 1 May 2007 08:24:56 +0200 Andi Kleen <ak@xxxxxxx> wrote:

> > The bug is in firstfloor only, and the fix (if present) will be there too.
> >
> > <checks>
> >
> > Nope,
> >
> > ftp://ftp.firstfloor.org/pub/ak/x86_64/quilt/patches/sched-clock-share
> >
> > is identical to
> >
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.21-rc7/2.6.21-rc7-mm2/broken-out/x86_64-mm-sched-clock-share.patch
>
> Or perhaps the deadlock is in the cpufrequency handler. Does it happen without CONFIG_CPUFREQ
> too?
>
> [cpufreq handler calls ktime_get which might take xtime lock for reading]
>

Sounds right. That's what was happening to me for a while.

Randy, it'd be interesting to try:

--- a/arch/x86_64/kernel/tsc.c~a
+++ a/arch/x86_64/kernel/tsc.c
@@ -84,8 +84,8 @@ static int time_cpufreq_notifier(struct
cpufreq_scale(loops_per_jiffy_ref, ref_freq, freq->new);

tsc_khz = cpufreq_scale(tsc_khz_ref, ref_freq, freq->new);
- if (!(freq->flags & CPUFREQ_CONST_LOOPS))
- mark_tsc_unstable("cpufreq changes");
+// if (!(freq->flags & CPUFREQ_CONST_LOOPS))
+// mark_tsc_unstable("cpufreq changes");
}

return 0;
_

and if that "fixes" it, disable netconsole and do

--- a/arch/x86_64/kernel/tsc.c~a
+++ a/arch/x86_64/kernel/tsc.c
@@ -85,7 +85,7 @@ static int time_cpufreq_notifier(struct

tsc_khz = cpufreq_scale(tsc_khz_ref, ref_freq, freq->new);
if (!(freq->flags & CPUFREQ_CONST_LOOPS))
- mark_tsc_unstable("cpufreq changes");
+ dump_stack();
}

return 0;
_

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/