x86/pti: smp_processor_id() called while preemptible in resume-from-sleep

From: Dominik Brodowski
Date: Sat Dec 30 2017 - 08:30:04 EST


Dear all,

resume-from-sleep (mem/S3) on v4.15-rc5-149-g5aa90a845892 triggers the
following bug. If I boot with "pti=off", the issue does not appear, and
kernels from before PTI was merged did not show it either:

[ 0.000000] microcode: microcode updated early to revision 0x25, date = 2017-01-27
[ 0.000000] Linux version 4.15.0-rc5+ (brodo@light) (gcc version 7.2.1 20171128 (GCC)) #2 SMP PREEMPT Sat Dec 30 12:03:51 CET 2017
[ 0.000000] Command line: BOOT_IMAGE=/vmlinuz-testing-git page_poison=on slub_debug=P
...
[ 0.000000] Memory: 7922664K/8291016K available (18460K kernel code, 2408K rwdata, 6548K rodata, 3440K init, 13116K bss, 368352K reserved, 0K cma-reserved)
[ 0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=4, Nodes=1
[ 0.000000] Kernel/User page tables isolation: enabled
...
[ 38.829235] PM: suspend entry (deep)
...
[ 39.893319] Disabling non-boot CPUs ...
[ 39.911270] smpboot: CPU 1 is now offline
[ 39.928370] smpboot: CPU 2 is now offline
[ 39.947837] smpboot: CPU 3 is now offline
[ 39.951703] ACPI: Low-level resume complete
[ 39.951832] ACPI: EC: EC started
[ 39.951840] PM: Restoring platform NVS memory
[ 39.954648] Enabling non-boot CPUs ...
[ 39.954792] x86: Booting SMP configuration:
[ 39.954800] smpboot: Booting Node 0 Processor 1 APIC 0x2
[ 39.954834] BUG: using smp_processor_id() in preemptible [00000000] code: sh/465
[ 39.954841] caller is native_cpu_up+0x2f0/0xa30
[ 39.954847] CPU: 0 PID: 465 Comm: sh Not tainted 4.15.0-rc5+ #2
[ 39.954851] Hardware name: Dell Inc. XPS 13 9343/0TM99H, BIOS A11 12/08/2016
[ 39.954855] Call Trace:
[ 39.954863] dump_stack+0x67/0x95
[ 39.954871] check_preemption_disabled+0xd8/0xe0
[ 39.954880] native_cpu_up+0x2f0/0xa30
[ 39.954896] bringup_cpu+0x25/0xa0
[ 39.954902] ? cpuhp_kick_ap+0x70/0x70
[ 39.954909] cpuhp_invoke_callback+0xb8/0xc50
[ 39.954930] _cpu_up+0xad/0x170
[ 39.954943] enable_nonboot_cpus+0x9e/0x320
[ 39.954953] suspend_devices_and_enter+0x33c/0xd40
[ 39.954974] pm_suspend+0x6a0/0x9e0
[ 39.954988] state_store+0x7d/0xf0
[ 39.955002] kernfs_fop_write+0x11c/0x1b0
[ 39.955014] __vfs_write+0x39/0x1d0
[ 39.955024] ? rcu_read_lock_sched_held+0x74/0x80
[ 39.955029] ? preempt_count_sub+0x92/0xd0
[ 39.955036] ? __sb_start_write+0x16a/0x1f0
[ 39.955047] vfs_write+0xcc/0x1b0
[ 39.955058] SyS_write+0x55/0xc0
[ 39.955072] entry_SYSCALL_64_fastpath+0x18/0x85
[ 39.955077] RIP: 0033:0x4b9a7e
[ 39.955081] RSP: 002b:00007ffda426c148 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 39.955088] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00000000004b9a7e
[ 39.955092] RDX: 0000000000000004 RSI: 00000000023da740 RDI: 0000000000000001
[ 39.955095] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[ 39.955099] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
[ 39.955103] R13: 00000000023d7020 R14: 00000000023d7418 R15: 0000000000000000
[ 39.956435] BUG: using smp_processor_id() in preemptible [00000000] code: sh/465
[ 39.956442] caller is native_cpu_up+0x447/0xa30
[ 39.956448] CPU: 0 PID: 465 Comm: sh Not tainted 4.15.0-rc5+ #2
[ 39.956452] Hardware name: Dell Inc. XPS 13 9343/0TM99H, BIOS A11 12/08/2016
[ 39.956455] Call Trace:
[ 39.956465] dump_stack+0x67/0x95
[ 39.956473] check_preemption_disabled+0xd8/0xe0
[ 39.956482] native_cpu_up+0x447/0xa30
[ 39.956498] bringup_cpu+0x25/0xa0
[ 39.956504] ? cpuhp_kick_ap+0x70/0x70
[ 39.956511] cpuhp_invoke_callback+0xb8/0xc50
[ 39.956532] _cpu_up+0xad/0x170
[ 39.956544] enable_nonboot_cpus+0x9e/0x320
[ 39.956555] suspend_devices_and_enter+0x33c/0xd40
[ 39.956575] pm_suspend+0x6a0/0x9e0
[ 39.956589] state_store+0x7d/0xf0
[ 39.956603] kernfs_fop_write+0x11c/0x1b0
[ 39.956615] __vfs_write+0x39/0x1d0
[ 39.956625] ? rcu_read_lock_sched_held+0x74/0x80
[ 39.956630] ? preempt_count_sub+0x92/0xd0
[ 39.956637] ? __sb_start_write+0x16a/0x1f0
[ 39.956649] vfs_write+0xcc/0x1b0
[ 39.956660] SyS_write+0x55/0xc0
[ 39.956673] entry_SYSCALL_64_fastpath+0x18/0x85
[ 39.956678] RIP: 0033:0x4b9a7e
[ 39.956682] RSP: 002b:00007ffda426c148 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 39.956689] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00000000004b9a7e
[ 39.956693] RDX: 0000000000000004 RSI: 00000000023da740 RDI: 0000000000000001
[ 39.956696] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[ 39.956700] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
[ 39.956704] R13: 00000000023d7020 R14: 00000000023d7418 R15: 0000000000000000
[ 39.958429] cache: parent cpu1 should not be sleeping
[ 39.959881] CPU1 is up
[ 39.960013] smpboot: Booting Node 0 Processor 2 APIC 0x1
[ 39.960023] BUG: using smp_processor_id() in preemptible [00000000] code: sh/465

... and then the same for CPUs 2 and 3, meaning the "BUG:" splat is triggered
six times overall, twice for each non-boot CPU.
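
In case anyone wants to check the count against their own saved dmesg, here
is a minimal sketch of a tally script. It assumes the splat header looks
exactly like the lines quoted above ("BUG: using smp_processor_id() in
preemptible ..." followed by a "caller is ..." line); the function name
count_splats() is made up for this mail and is not part of any kernel tool.

```python
import re
from collections import Counter

# Header of each splat, as printed in the log quoted in this mail.
BUG_MARKER = "BUG: using smp_processor_id() in preemptible"
# The line right after the header names the offending call site.
CALLER_RE = re.compile(r"caller is (\S+)")


def count_splats(dmesg_text):
    """Return (total number of splats, Counter of 'caller is' symbols).

    Hypothetical helper: it only looks at the line directly following
    each splat header, so a truncated log simply yields fewer callers.
    """
    total = 0
    callers = Counter()
    expect_caller = False
    for line in dmesg_text.splitlines():
        if BUG_MARKER in line:
            total += 1
            expect_caller = True
        elif expect_caller:
            match = CALLER_RE.search(line)
            if match:
                callers[match.group(1)] += 1
            expect_caller = False
    return total, callers
```

Feeding it the text of a saved log, e.g. count_splats(open("dmesg.txt").read())
with "dmesg.txt" being whatever file the log was saved to, gives the total and
a per-call-site breakdown, which makes it easy to see whether both offsets in
native_cpu_up show up once per CPU.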

Thanks,
Dominik