Re: x86 boot broken on -rc1?
From: Björn Töpel
Date: Wed Dec 13 2017 - 14:37:11 EST
2017-12-02 1:39 GMT+01:00 Jakub Kicinski <jakub.kicinski@xxxxxxxxxxxxx>:
> Hi!
>
> I'm hitting these after DaveM pulled rc1 into net-next on my Xeon
> E5-2630 v4 box. It also happens on linux-next. Did anyone else
> experience it? (.config attached)
>
> [ 5.003771] WARNING: CPU: 14 PID: 1 at ../arch/x86/events/intel/uncore.c:936 uncore_pci_probe+0x285/0x2b0
> [ 5.007544] Modules linked in:
> [ 5.007544] CPU: 14 PID: 1 Comm: swapper/0 Not tainted 4.15.0-rc1-perf-00225-gb2a4e0a76b1d #782
> [ 5.007544] Hardware name: Dell Inc. PowerEdge R730/072T6D, BIOS 2.3.4 11/08/2016
> [ 5.007544] task: 000000009e842725 task.stack: 000000008a63fd2d
> [ 5.007544] RIP: 0010:uncore_pci_probe+0x285/0x2b0
> [ 5.007544] RSP: 0000:ffffad8580163d10 EFLAGS: 00010286
> [ 5.007544] RAX: ffff98576cc3df30 RBX: ffffffffb08037e0 RCX: ffffffffb0c1a120
> [ 5.007544] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffffb0c1a960
> [ 5.007544] RBP: ffff985b6c00ac00 R08: fffffffffffffffe R09: 00000000000fffff
> [ 5.007544] R10: ffff98576f1b6018 R11: 0000000000000022 R12: ffff985b6c641000
> [ 5.007544] R13: 0000000000000001 R14: 0000000000000001 R15: 0000000000000001
> [ 5.007544] FS: 0000000000000000(0000) GS:ffff98576fb80000(0000) knlGS:0000000000000000
> [ 5.007544] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 5.007544] CR2: 0000000000000000 CR3: 0000000185c09001 CR4: 00000000003606e0
> [ 5.007544] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 5.007544] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ 5.007544] Call Trace:
> [ 5.007544] local_pci_probe+0x3d/0x90
> [ 5.007544] ? pci_match_device+0xd9/0x100
> [ 5.007544] pci_device_probe+0x122/0x180
> [ 5.007544] driver_probe_device+0x246/0x330
> [ 5.007544] ? set_debug_rodata+0x11/0x11
> [ 5.007544] __driver_attach+0x8a/0x90
> [ 5.007544] ? driver_probe_device+0x330/0x330
> [ 5.007544] bus_for_each_dev+0x5c/0x90
> [ 5.007544] bus_add_driver+0x196/0x220
> [ 5.007544] driver_register+0x57/0xc0
> [ 5.007544] intel_uncore_init+0x1e3/0x249
> [ 5.007544] ? uncore_type_init+0x193/0x193
> [ 5.007544] ? set_debug_rodata+0x11/0x11
> [ 5.007544] do_one_initcall+0x4b/0x190
> [ 5.007544] kernel_init_freeable+0x16e/0x1f5
> [ 5.007544] ? rest_init+0xd0/0xd0
> [ 5.007544] kernel_init+0xa/0x100
> [ 5.007544] ret_from_fork+0x1f/0x30
> [ 5.007544] Code: 48 8b 52 08 48 85 d2 74 0d 89 44 24 04 48 89 df ff d2 8b 44 24 04 48 89 df 89 44 24 04 e8 54 0a 1c 00 8b 44 24 0
> [ 5.007544] ---[ end trace 4dc4c3d5f5afcd2f ]---
> [ 5.244504] bdx_uncore: probe of 0000:ff:08.2 failed with error -22
> [ 5.251604] bdx_uncore: probe of 0000:ff:0b.1 failed with error -22
> [ 5.258711] bdx_uncore: probe of 0000:ff:10.1 failed with error -22
> [ 5.265819] bdx_uncore: probe of 0000:ff:14.0 failed with error -22
> [ 5.272919] bdx_uncore: probe of 0000:ff:14.1 failed with error -22
> [ 5.280019] bdx_uncore: probe of 0000:ff:15.0 failed with error -22
> [ 5.287112] bdx_uncore: probe of 0000:ff:15.1 failed with error -22
> [ 5.294376] WARNING: CPU: 1 PID: 15 at ../arch/x86/events/intel/uncore.c:1065 uncore_change_type_ctx.isra.5+0xe6/0xf0
> [ 5.298362] Modules linked in:
> [ 5.298362] CPU: 1 PID: 15 Comm: cpuhp/1 Tainted: G W 4.15.0-rc1-perf-00225-gb2a4e0a76b1d #782
> [ 5.298362] Hardware name: Dell Inc. PowerEdge R730/072T6D, BIOS 2.3.4 11/08/2016
> [ 5.298362] task: 00000000ae78bc8f task.stack: 00000000f79660c1
> [ 5.298362] RIP: 0010:uncore_change_type_ctx.isra.5+0xe6/0xf0
> [ 5.298362] RSP: 0000:ffffad85833b3db8 EFLAGS: 00010213
> [ 5.298362] RAX: 0000000000000000 RBX: ffff9857669b0200 RCX: 0000000000000001
> [ 5.298362] RDX: ffff985b6f000000 RSI: ffff985b66580400 RDI: ffffffffb0c1ae8c
> [ 5.298362] RBP: ffff985b66580400 R08: ffffffffb0c1ae8c R09: 0000000000000001
> [ 5.298362] R10: 0000000000000000 R11: 00000000003d0900 R12: 0000000000000000
> [ 5.298362] R13: ffffffffffffffff R14: 0000000000000001 R15: 0000000000000008
> [ 5.298362] FS: 0000000000000000(0000) GS:ffff985b6f000000(0000) knlGS:0000000000000000
> [ 5.298362] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 5.298362] CR2: 0000000000000000 CR3: 0000000185c09001 CR4: 00000000003606e0
> [ 5.298362] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 5.298362] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ 5.298362] Call Trace:
> [ 5.298362] uncore_event_cpu_online+0x283/0x340
> [ 5.298362] ? uncore_event_cpu_offline+0x180/0x180
> [ 5.298362] cpuhp_invoke_callback+0x8c/0x620
> [ 5.298362] ? __schedule+0x1ad/0x6c0
> [ 5.298362] ? sort_range+0x20/0x20
> [ 5.298362] cpuhp_thread_fun+0xbc/0x140
> [ 5.298362] smpboot_thread_fn+0x114/0x1d0
> [ 5.298362] kthread+0x111/0x130
> [ 5.298362] ? kthread_create_on_node+0x40/0x40
> [ 5.298362] ret_from_fork+0x1f/0x30
> [ 5.298362] Code: 2a 44 89 73 10 41 83 c4 01 48 81 c5 40 01 00 00 45 3b 20 7c cf 48 83 c4 08 5b 5d 41 5c 41 5d 41 5e 41 5f c3 0f f
> [ 5.298362] ---[ end trace 4dc4c3d5f5afcd30 ]---
> [ 5.504808] Scanning for low memory corruption every 60 seconds
> [ 5.512347] Initialise system trusted keyrings
> [ 5.517470] workingset: timestamp_bits=40 max_order=23 bucket_order=0
> [ 5.524840] BUG: unable to handle kernel paging request at 0000000023314bf4
> [ 5.528761] IP: __kmalloc_track_caller+0xa8/0x210
> [ 5.528761] PGD 185c0a067 P4D 185c0a067 PUD 185c0c067 PMD 0
> [ 5.528761] Oops: 0000 [#1] PREEMPT SMP
> [ 5.528761] Modules linked in:
> [ 5.528761] CPU: 14 PID: 1 Comm: swapper/0 Tainted: G W 4.15.0-rc1-perf-00225-gb2a4e0a76b1d #782
> [ 5.528761] Hardware name: Dell Inc. PowerEdge R730/072T6D, BIOS 2.3.4 11/08/2016
> [ 5.528761] task: 000000009e842725 task.stack: 000000008a63fd2d
> [ 5.528761] RIP: 0010:__kmalloc_track_caller+0xa8/0x210
> [ 5.528761] RSP: 0000:ffffad8580163d58 EFLAGS: 00010286
> [ 5.528761] RAX: 0000000000000000 RBX: ffffffffffffffff RCX: 000000000012ce0e
> [ 5.528761] RDX: 000000000012cd0e RSI: 000000000012cd0e RDI: 000000000001dde0
> [ 5.528761] RBP: ffff985700000001 R08: ffff98576f407c00 R09: ffffffffb071edbf
> [ 5.528761] R10: ffffd54de1995600 R11: ffff985b6655915f R12: 0000000000000004
> [ 5.528761] R13: 00000000014000c0 R14: ffffffffb026c239 R15: ffff98576f407c00
> [ 5.528761] FS: 0000000000000000(0000) GS:ffff98576fb80000(0000) knlGS:0000000000000000
> [ 5.528761] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 5.528761] CR2: ffffffffffffffff CR3: 0000000185c09001 CR4: 00000000003606e0
> [ 5.528761] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 5.528761] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ 5.528761] Call Trace:
> [ 5.528761] kstrdup+0x2d/0x60
> [ 5.528761] __kernfs_new_node+0x29/0x130
> [ 5.528761] kernfs_new_node+0x24/0x50
> [ 5.528761] kernfs_create_link+0x29/0x90
> [ 5.528761] sysfs_do_create_link_sd.isra.0+0x5d/0xc0
> [ 5.528761] sysfs_slab_add+0x1f5/0x270
> [ 5.528761] ? set_debug_rodata+0x11/0x11
> [ 5.528761] slab_sysfs_init+0x8b/0xfa
> [ 5.528761] ? kmem_cache_init+0xf9/0xf9
> [ 5.528761] do_one_initcall+0x4b/0x190
> [ 5.528761] kernel_init_freeable+0x16e/0x1f5
> [ 5.528761] ? rest_init+0xd0/0xd0
> [ 5.528761] kernel_init+0xa/0x100
> [ 5.528761] ret_from_fork+0x1f/0x30
> [ 5.528761] Code: 49 63 47 20 49 8b 3f 48 8d 8a 00 01 00 00 48 8b 5c 05 00 48 89 e8 65 48 0f c7 0f 0f 94 c0 84 c0 74 ab 48 85 db 7
> [ 5.528761] RIP: __kmalloc_track_caller+0xa8/0x210 RSP: ffffad8580163d58
> [ 5.528761] CR2: ffffffffffffffff
> [ 5.528761] ---[ end trace 4dc4c3d5f5afcd31 ]---
> [ 5.773089] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009
> [ 5.773089]
> [ 5.777076] Kernel Offset: 0x2f000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
> [ 5.777076] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009
Yes, I'm getting that as well (v4.15-rc2-772-gcdc0974f10cf).
Did you bisect it? I haven't gotten around to it yet.
Björn