lockdep issue booting v4.1 upstream kernel with >64 x86_64 CPUs

From: Michel Lespinasse
Date: Fri Jun 26 2015 - 21:38:31 EST


Hi Peter,

I am hitting a minor issue when booting a lockdep-enabled x86_64
kernel with >64 CPUs.

The kernel boots the first 64 CPUs without issues, but then lockdep
complains that start_secondary -> init_espfix_ap is allocating memory
with IRQs disabled:

[ 0.310566] x86: Booting SMP configuration:
[ 0.310569] .... node #0, CPUs: #1 #2 #3 #4 #5 #6 #7 #8 #9 #10 #11 #12 #13 #14 #15 #16 #17
[ 0.569224] .... node #1, CPUs: #18 #19 #20 #21 #22 #23 #24 #25 #26 #27 #28 #29 #30 #31 #32 #33 #34 #35
[ 0.922841] .... node #0, CPUs: #36 #37 #38 #39 #40 #41 #42 #43 #44 #45 #46 #47 #48 #49 #50 #51 #52 #53
[ 1.198048] .... node #1, CPUs: #54 #55 #56 #57 #58 #59 #60 #61 #62 #63 #64
[ 1.365553] ------------[ cut here ]------------
[ 1.370300] WARNING: CPU: 64 PID: 0 at kernel/locking/lockdep.c:2755 lockdep_trace_alloc+0xc5/0xd0()
[ 1.379318] DEBUG_LOCKS_WARN_ON(irqs_disabled_flags(flags))
[ 1.384648] Modules linked in:
[ 1.387851] CPU: 64 PID: 0 Comm: swapper/64 Not tainted 4.1.0-dbg-DEV #1
[ 1.402368] ffffffff81a1da64 ffff881ff137bc48 ffffffff816eb1cd ffffffff8111d921
[ 1.409705] ffff881ff137bc98 ffff881ff137bc88 ffffffff810b9897 0000000000000066
[ 1.417043] 0000000000000092 ffff881ffee813b0 00000000000000d0 ffff88207ffff380
[ 1.424379] Call Trace:
[ 1.426799] [<ffffffff816eb1cd>] dump_stack+0x4c/0x65
[ 1.431872] [<ffffffff8111d921>] ? console_unlock+0x1f1/0x510
[ 1.437634] [<ffffffff810b9897>] warn_slowpath_common+0x97/0xe0
[ 1.443565] [<ffffffff810b9996>] warn_slowpath_fmt+0x46/0x50
[ 1.449237] [<ffffffff81113e85>] lockdep_trace_alloc+0xc5/0xd0
[ 1.455086] [<ffffffff811c97dd>] __alloc_pages_nodemask+0xad/0xb90
[ 1.461276] [<ffffffff811317d7>] ? add_timer_on+0x67/0x290
[ 1.466779] [<ffffffff81214747>] alloc_page_interleave+0x37/0x90
[ 1.472796] [<ffffffff81215a65>] alloc_pages_current+0x155/0x170
[ 1.478814] [<ffffffff811c4284>] ? __get_free_pages+0x14/0x50
[ 1.484571] [<ffffffff811c4284>] __get_free_pages+0x14/0x50
[ 1.490162] [<ffffffff8106adc8>] init_espfix_ap+0x1b8/0x260
[ 1.495752] [<ffffffff8109a329>] start_secondary+0xf9/0x170
[ 1.501342] ---[ end trace b6700c7f1c6b959e ]---
[ 1.506350] #65 #66 #67 #68 #69 #70 #71
[ 1.615186] x86: Booted up 2 nodes, 72 CPUs
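
One plausible reading of why the warning fires exactly once, on CPU 64,
is that espfix stacks are packed many-per-page, so only the first CPU
that lands on a fresh page reaches the allocation in init_espfix_ap().
The following is a rough userspace sketch of that arithmetic, not kernel
code. It assumes 4 KiB pages and a 64-byte per-CPU espfix stack (the
values believed to be used by arch/x86/kernel/espfix_64.c in v4.1); the
MODEL_* names and both helpers are hypothetical and exist only for
illustration.

```c
#include <stdbool.h>

/* Assumed values, mirroring what espfix_64.c is believed to use in v4.1. */
#define MODEL_PAGE_SIZE          4096UL
#define MODEL_ESPFIX_STACK_SIZE  (8 * 8UL)   /* 64 bytes per CPU */
#define MODEL_STACKS_PER_PAGE    (MODEL_PAGE_SIZE / MODEL_ESPFIX_STACK_SIZE)

/* Hypothetical helper: index of the espfix page a given CPU would use. */
static unsigned long model_espfix_page_index(unsigned int cpu)
{
	return cpu / MODEL_STACKS_PER_PAGE;
}

/*
 * Hypothetical helper: true when this CPU is the first one on its page.
 * Assuming CPUs come up in ascending order, that is the CPU that would
 * have to allocate the page; later CPUs on the same page reuse it.
 */
static bool model_cpu_is_first_on_page(unsigned int cpu)
{
	return (cpu % MODEL_STACKS_PER_PAGE) == 0;
}
```

Under these assumptions CPUs 0-63 share page 0 and CPU 64 is the first
to need page 1, which would match the single warning on CPU 64 and the
clean bring-up of #65 through #71 in the log above.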

I think the correct fix may be to change one of the GFP masks in
init_espfix_ap(), but I am not 100% sure.
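
To make the GFP-mask idea concrete, here is a simplified userspace model
of the check behind the DEBUG_LOCKS_WARN_ON above. It is a sketch only:
the flag values are assumed to match the v4.1 definitions in
include/linux/gfp.h, the MODEL_* names and the helper are hypothetical,
and the real lockdep_trace_alloc() has further early exits (for example
for PF_MEMALLOC tasks) that are left out here.

```c
#include <stdbool.h>

/* Assumed v4.1 bit values; illustrative only, not taken from the log. */
#define MODEL_GFP_WAIT    0x10u
#define MODEL_GFP_HIGH    0x20u
#define MODEL_GFP_IO      0x40u
#define MODEL_GFP_FS      0x80u

#define MODEL_GFP_KERNEL  (MODEL_GFP_WAIT | MODEL_GFP_IO | MODEL_GFP_FS)
#define MODEL_GFP_ATOMIC  (MODEL_GFP_HIGH)

/*
 * Hypothetical helper: returns true when the modelled check would warn.
 * Only allocations that may sleep (WAIT) and may recurse into the
 * filesystem (FS) are checked; those warn if IRQs are disabled.
 */
static bool model_alloc_would_warn(unsigned int gfp_mask, bool irqs_disabled)
{
	if (!(gfp_mask & MODEL_GFP_WAIT))
		return false;	/* cannot sleep, so no reclaim to worry about */
	if (!(gfp_mask & MODEL_GFP_FS))
		return false;	/* only FS-capable allocations are checked */
	return irqs_disabled;
}
```

In this model a GFP_KERNEL-style mask warns when IRQs are off, while a
GFP_ATOMIC-style mask does not. The trade-off is that a non-sleeping
allocation cannot wait for reclaim and is therefore more likely to
fail, so the caller would need to handle that case.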

Thanks,

--
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.