Re: Latest tip kernel (3.2-rc1-tip_cf6b3899) fails to boot on x3850x5 machine

From: Thomas Gleixner
Date: Wed Dec 14 2011 - 08:30:52 EST


On Wed, 14 Dec 2011, Nikunj A Dadhania wrote:

> Hi,
>
> I was trying the latest tip kernel and I am seeing the following
> message. After some more logs the machine does not proceed.
>
> ------------[ cut here ]------------
> WARNING: at kernel/time/clockevents.c:47 clockevent_delta2ns+0x79/0x90()

OK, that's caused by stupidity in the APIC timer code. Fix
below. Note that the patch won't fix the boot problem. It just
prevents the warning.
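
For readers following along, here is a rough userspace sketch of why the
order of the two steps matters. It is not the kernel code. It assumes the
calibration measured a delta of 0 (timer not counting), that the warning
at clockevents.c:47 is the zero-mult check in clockevent_delta2ns(), and
that HZ is 1000; none of that is stated in this thread. All sketch_* and
SKETCH_* names are made up for the illustration.

```c
#include <stdint.h>

/* Illustrative constants only; real values depend on the kernel config. */
#define SKETCH_HZ            1000
#define SKETCH_CAL_LOOPS     (SKETCH_HZ / 10)
#define SKETCH_APIC_DIVISOR  16
#define SKETCH_TICK_NSEC     (1000000000ULL / SKETCH_HZ)
#define SKETCH_SHIFT         32

/* Scaled-math factor (ticks << shift) / nsec, modelled on div_sc(). */
static uint64_t sketch_div_sc(uint64_t ticks, uint64_t nsec, int shift)
{
	return (ticks << shift) / nsec;
}

/* The sanity check from the patch: 1 if the result looks usable. */
static int sketch_frequency_ok(uint64_t delta)
{
	uint64_t freq = (delta * SKETCH_APIC_DIVISOR) / SKETCH_CAL_LOOPS;

	return freq >= (1000000 / SKETCH_HZ);
}

/*
 * check_first == 0 models the old order (compute mult, check later),
 * check_first == 1 models the patched order (check, then compute mult).
 * Returns 0 on success and -1 if the calibration result is rejected.
 * *warned is set to 1 when a zero mult reaches the conversion step,
 * which stands in for the WARNING in the log above.
 */
static int sketch_calibrate(uint64_t delta, int check_first, int *warned)
{
	uint64_t mult;

	*warned = 0;

	if (check_first && !sketch_frequency_ok(delta))
		return -1;

	mult = sketch_div_sc(delta, SKETCH_TICK_NSEC * SKETCH_CAL_LOOPS,
			     SKETCH_SHIFT);
	if (mult == 0)
		*warned = 1;

	if (!check_first && !sketch_frequency_ok(delta))
		return -1;

	return 0;
}
```

With delta == 0 both orders reject the result, but only the old order
gets as far as the zero-mult warning. A healthy delta, say 1000000,
passes the check in either order.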

> Hardware name: System x3850 X5 -[71455RQ]-
> Modules linked in:
> Pid: 1, comm: swapper/0 Not tainted 3.2.0-rc4-tip #3
> Call Trace:
> [<ffffffff8104d3af>] warn_slowpath_common+0x7f/0xc0
> [<ffffffff8104d40a>] warn_slowpath_null+0x1a/0x20
> [<ffffffff81097ab9>] clockevent_delta2ns+0x79/0x90
> [<ffffffff81e49109>] calibrate_APIC_clock+0x16e/0x377
> [<ffffffff81e4942c>] setup_boot_APIC_clock+0x59/0x7e
> [<ffffffff81e472d1>] native_smp_prepare_cpus+0x1f1/0x205
> [<ffffffff81e38767>] kernel_init+0x1b6/0x29a
> [<ffffffff815f9ca4>] kernel_thread_helper+0x4/0x10
> [<ffffffff81e385b1>] ? parse_early_options+0x20/0x20
> [<ffffffff815f9ca0>] ? gs_change+0x13/0x13
> ---[ end trace a7919e7f17c0a725 ]---
> APIC frequency too slow, disabling apic timer

Ouch. This is really bad.

> Performance Events: PEBS fmt1+, Nehalem events, Intel PMU driver.
> CPU erratum AAJ80 worked around
> CPUID marked event: 'bus cycles' unavailable
> ... version: 3
> ... bit width: 48
> ... generic registers: 4
> ... value mask: 0000ffffffffffff
> ... max period: 000000007fffffff
> ... fixed-purpose events: 3
> ... event mask: 000000070000000f
> NMI watchdog enabled, takes one hw-pmu counter.
> Booting Node 0, Processors #1
> APIC never delivered???
> APIC delivery error (ef).
> #2

And that is probably caused by a non-working APIC. From your log:

weird, boot CPU (#255) not listed by the BIOS.
LAPIC pending interrupts after 512 EOI

Can you please provide a full boot log up to the point of

Booting Node 0, Processors #1

Your log is missing the head of the boot process. Can you also add
"apic=verbose" to the kernel command line, please?

Thanks,

tglx

---------->
diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index 39e3eaa..35a5b31 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -696,6 +696,18 @@ static int __init calibrate_APIC_clock(void)
 	pm_referenced = !calibrate_by_pmtimer(lapic_cal_pm2 - lapic_cal_pm1,
 					&delta, &deltatsc);

+
+	lapic_timer_frequency = (delta * APIC_DIVISOR) / LAPIC_CAL_LOOPS;
+
+	/*
+	 * Do a sanity check on the APIC calibration result
+	 */
+	if (lapic_timer_frequency < (1000000 / HZ)) {
+		local_irq_enable();
+		pr_warning("APIC frequency too slow, disabling apic timer\n");
+		return -1;
+	}
+
 	/* Calculate the scaled math multiplication factor */
 	lapic_clockevent.mult = div_sc(delta, TICK_NSEC * LAPIC_CAL_LOOPS,
 				       lapic_clockevent.shift);
@@ -704,8 +716,6 @@ static int __init calibrate_APIC_clock(void)
 	lapic_clockevent.min_delta_ns =
 		clockevent_delta2ns(0xF, &lapic_clockevent);

-	lapic_timer_frequency = (delta * APIC_DIVISOR) / LAPIC_CAL_LOOPS;
-
 	apic_printk(APIC_VERBOSE, "..... delta %ld\n", delta);
 	apic_printk(APIC_VERBOSE, "..... mult: %u\n", lapic_clockevent.mult);
 	apic_printk(APIC_VERBOSE, "..... calibration result: %u\n",
@@ -723,15 +733,6 @@ static int __init calibrate_APIC_clock(void)
 		    lapic_timer_frequency / (1000000 / HZ),
 		    lapic_timer_frequency % (1000000 / HZ));

-	/*
-	 * Do a sanity check on the APIC calibration result
-	 */
-	if (lapic_timer_frequency < (1000000 / HZ)) {
-		local_irq_enable();
-		pr_warning("APIC frequency too slow, disabling apic timer\n");
-		return -1;
-	}
-
 	levt->features &= ~CLOCK_EVT_FEAT_DUMMY;

 	/*