Re: perf related boot hang.

From: Peter Zijlstra
Date: Wed Aug 06 2014 - 12:20:01 EST


On Wed, Aug 06, 2014 at 10:36:21AM -0400, Dave Jones wrote:
> On Linus current tree, when I cold-boot one of my boxes, it locks up
> during boot up with this trace..
>
> Kernel panic - not syncing: Watchdog detected hard LOCKUP on cpu 2
> CPU: 2 PID: 577 Comm: in:imjournal Not tainted 3.16.0+ #33
> ffff880244c06c88 000000008b73013e ffff880244c06bf0 ffffffffb47ee207
> ffffffffb4c51118 ffff880244c06c78 ffffffffb47ebcf8 0000000000000010
> ffff880244c06c88 ffff880244c06c20 000000008b73013e 0000000000000000
> Call Trace:
> <NMI> [<ffffffffb47ee207>] dump_stack+0x4e/0x7a
> [<ffffffffb47ebcf8>] panic+0xd4/0x207
> [<ffffffffb4145448>] watchdog_overflow_callback+0x118/0x120
> [<ffffffffb4186f0e>] __perf_event_overflow+0xae/0x350
> [<ffffffffb4185380>] ? perf_event_task_disable+0xa0/0xa0
> [<ffffffffb401a4ef>] ? x86_perf_event_set_period+0xbf/0x150
> [<ffffffffb4187d34>] perf_event_overflow+0x14/0x20
> [<ffffffffb40203a6>] intel_pmu_handle_irq+0x206/0x410
> [<ffffffffb401939b>] perf_event_nmi_handler+0x2b/0x50
> [<ffffffffb4007b72>] nmi_handle+0xd2/0x390
> [<ffffffffb4007aa5>] ? nmi_handle+0x5/0x390
> [<ffffffffb40d8301>] ? lock_acquired+0x131/0x450
> [<ffffffffb4008062>] default_do_nmi+0x72/0x1c0
>
>
> If I reset it, it then seems to always boot up fine.

Uhm,. cute! And that's the entire stacktrace? It would seem to me there
would be at least a 'task' context below that. CPUs simply do not _only_
run NMI code, and that trace starts at default_do_nmi().


Attachment: pgp2ITV7vTKCs.pgp
Description: PGP signature