lockdep splat during suspend

From: Borislav Petkov
Date: Thu Nov 17 2016 - 12:31:05 EST


Hi dudes,

this is what I'm seeing on 4.9-rc5+tip/master during suspend. (And yes,
that printk thing is nuts):

[ 87.471336] hib.sh (2591): drop_caches: 3
[ 87.779292] PM: Syncing filesystems ...
[ 87.788772] done.
[ 87.788853] Freezing user space processes ... (elapsed 0.001 seconds) done.
[ 87.791447] PM: Preallocating image memory... done (allocated 105192 pages)
[ 88.074812] PM: Allocated 420768 kbytes in 0.28 seconds (1502.74 MB/s)
[ 88.074852] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
[ 88.080163] ACPI : EC: event blocked
[ 88.334267] PM: freeze of devices complete after 256.808 msecs
[ 88.353487] PM: late freeze of devices complete after 19.202 msecs
[ 88.354728] ACPI : EC: interrupt blocked
[ 88.355761] PM: noirq freeze of devices complete after 2.257 msecs
[ 88.355777] Disabling non-boot CPUs ...
[ 88.378463] Broke affinity for irq 16
[ 88.378468] Broke affinity for irq 18
[ 88.378471] Broke affinity for irq 19
[ 88.379835] smpboot: CPU 1 is now offline

[ 88.402872] ======================================================
[ 88.402873] [ INFO: possible circular locking dependency detected ]
[ 88.402874] 4.9.0-rc5+ #1 Not tainted
[ 88.402875] -------------------------------------------------------
[ 88.402875] hib.sh/2591 is trying to acquire lock:
[ 88.402876] (
[ 88.402877] console_lock
[ 88.402877] ){+.+.+.}
[ 88.402883] , at:
[ 88.402883] [<ffffffff810d26de>] console_cpu_notify+0xe/0x20
[ 88.402883]
but task is already holding lock:
[ 88.402884] (
[ 88.402884] cpu_hotplug.lock
[ 88.402884] ){+.+.+.}
[ 88.402886] , at:
[ 88.402886] [<ffffffff8106bd52>] cpu_hotplug_begin+0x72/0xc0
[ 88.402887]
which lock already depends on the new lock.

[ 88.402887]
the existing dependency chain (in reverse order) is:
[ 88.402888]
-> #3
[ 88.402888] (
[ 88.402888] cpu_hotplug.lock
[ 88.402889] ){+.+.+.}
[ 88.402889] :
[ 88.402892]
[ 88.402892] [<ffffffff810c11e2>] lock_acquire+0xb2/0x200
[ 88.402895]
[ 88.402895] [<ffffffff817582c4>] mutex_lock_nested+0x54/0x600
[ 88.402898]
[ 88.402898] [<ffffffff8106a063>] get_online_cpus+0x53/0x70
[ 88.402901]
[ 88.402901] [<ffffffff810def14>] _rcu_barrier+0xa4/0x160
[ 88.402902]
[ 88.402902] [<ffffffff810df025>] rcu_barrier+0x15/0x20
[ 88.402905]
[ 88.402905] [<ffffffff8164d024>] netdev_run_todo+0x64/0x300
[ 88.402907]
[ 88.402907] [<ffffffff8165a9de>] rtnl_unlock+0xe/0x10
[ 88.402909]
[ 88.402909] [<ffffffffa02f6bff>] 0xffffffffa02f6bff
[ 88.402910]
[ 88.402910] [<ffffffffa04e83f8>] 0xffffffffa04e83f8
[ 88.402911]
[ 88.402911] [<ffffffffa04e07cb>] 0xffffffffa04e07cb
[ 88.402914]
[ 88.402914] [<ffffffffa0264d29>] msr_exit+0x962/0xc39 [msr]
[ 88.402914]
[ 88.402914] [<ffffffffa04f68cf>] 0xffffffffa04f68cf
[ 88.402917]
[ 88.402917] [<ffffffff81103985>] SyS_delete_module+0x185/0x250
[ 88.402918]
[ 88.402919] [<ffffffff8175e1ee>] entry_SYSCALL_64_fastpath+0x1c/0xb1
[ 88.402919]
-> #2
[ 88.402920] (
[ 88.402920] rcu_preempt_state.barrier_mutex
[ 88.402920] ){+.+...}
[ 88.402920] :
[ 88.402922]
[ 88.402922] [<ffffffff810c11e2>] lock_acquire+0xb2/0x200
[ 88.402924]
[ 88.402924] [<ffffffff817582c4>] mutex_lock_nested+0x54/0x600
[ 88.402925]
[ 88.402925] [<ffffffff810deea6>] _rcu_barrier+0x36/0x160
[ 88.402927]
[ 88.402927] [<ffffffff810df025>] rcu_barrier+0x15/0x20
[ 88.402930]
[ 88.402930] [<ffffffff814b4183>] i915_gem_shrink_all+0x23/0x30
[ 88.402932]
[ 88.402932] [<ffffffff814b0c4a>] i915_gem_freeze+0x2a/0x50
[ 88.402935]
[ 88.402935] [<ffffffff8146ab32>] i915_pm_freeze+0x22/0x30
[ 88.402938]
[ 88.402938] [<ffffffff8137c29e>] pci_pm_freeze+0x5e/0xe0
[ 88.402940]
[ 88.402940] [<ffffffff81546fae>] dpm_run_callback+0x4e/0x2d0
[ 88.402942]
[ 88.402942] [<ffffffff81547c56>] __device_suspend+0xf6/0x2c0
[ 88.402943]
[ 88.402943] [<ffffffff81547e3f>] async_suspend+0x1f/0xa0
[ 88.402945]
[ 88.402945] [<ffffffff81092237>] async_run_entry_fn+0x37/0xe0
[ 88.402947]
[ 88.402947] [<ffffffff81088018>] process_one_work+0x1e8/0x730
[ 88.402948]
[ 88.402948] [<ffffffff810885a8>] worker_thread+0x48/0x4e0
[ 88.402950]
[ 88.402950] [<ffffffff8108ed04>] kthread+0xf4/0x110
[ 88.402952]
[ 88.402952] [<ffffffff8175e477>] ret_from_fork+0x27/0x40
[ 88.402952]
-> #1
[ 88.402952] (
[ 88.402953] &dev->struct_mutex
[ 88.402953] ){+.+.+.}
[ 88.402953] :
[ 88.402955]
[ 88.402955] [<ffffffff810c11e2>] lock_acquire+0xb2/0x200
[ 88.402957]
[ 88.402957] [<ffffffff817582c4>] mutex_lock_nested+0x54/0x600
[ 88.402960]
[ 88.402960] [<ffffffff8150ba85>] intel_fbdev_pan_display+0x35/0x70
[ 88.402962]
[ 88.402962] [<ffffffff813a130b>] fb_pan_display+0xdb/0x170
[ 88.402964]
[ 88.402964] [<ffffffff8139b870>] bit_update_start+0x20/0x50
[ 88.402965]
[ 88.402965] [<ffffffff8139872f>] fbcon_switch+0x3bf/0x610
[ 88.402968]
[ 88.402968] [<ffffffff814040a1>] redraw_screen+0x151/0x220
[ 88.402969]
[ 88.402969] [<ffffffff81397eb6>] fbcon_do_set_font+0x246/0x3e0
[ 88.402970]
[ 88.402970] [<ffffffff81398324>] fbcon_set_font+0x1f4/0x240
[ 88.402972]
[ 88.402973] [<ffffffff81409209>] con_font_op+0x3e9/0x490
[ 88.402974]
[ 88.402974] [<ffffffff813fb21d>] vt_ioctl+0x42d/0x13d0
[ 88.402976]
[ 88.402976] [<ffffffff813eed8e>] tty_ioctl+0x3ae/0xee0
[ 88.402979]
[ 88.402979] [<ffffffff812040a3>] do_vfs_ioctl+0x93/0x6a0
[ 88.402980]
[ 88.402980] [<ffffffff812046f1>] SyS_ioctl+0x41/0x70
[ 88.402982]
[ 88.402982] [<ffffffff8175e1ee>] entry_SYSCALL_64_fastpath+0x1c/0xb1
[ 88.402982]
-> #0
[ 88.402983] (
[ 88.402983] console_lock
[ 88.402983] ){+.+.+.}
[ 88.402983] :
[ 88.402985]
[ 88.402985] [<ffffffff810c0b5d>] __lock_acquire+0x11cd/0x1440
[ 88.402987]
[ 88.402987] [<ffffffff810c11e2>] lock_acquire+0xb2/0x200
[ 88.402989]
[ 88.402989] [<ffffffff810cf77c>] console_lock+0x3c/0x60
[ 88.402991]
[ 88.402991] [<ffffffff810d26de>] console_cpu_notify+0xe/0x20
[ 88.402992]
[ 88.402992] [<ffffffff8106aa59>] cpuhp_invoke_callback+0xa9/0x8b0
[ 88.402993]
[ 88.402993] [<ffffffff8106b472>] cpuhp_down_callbacks+0x42/0x80
[ 88.402995]
[ 88.402995] [<ffffffff817532a9>] _cpu_down+0xa9/0x110
[ 88.402997]
[ 88.402997] [<ffffffff8106c466>] freeze_secondary_cpus+0xb6/0x3e0
[ 88.402998]
[ 88.402998] [<ffffffff810c8731>] hibernation_snapshot+0x201/0x360
[ 88.403000]
[ 88.403000] [<ffffffff810c8f82>] hibernate+0x152/0x200
[ 88.403001]
[ 88.403001] [<ffffffff810c5e86>] state_store+0xd6/0xe0
[ 88.403003]
[ 88.403003] [<ffffffff8133f71f>] kobj_attr_store+0xf/0x20
[ 88.403006]
[ 88.403006] [<ffffffff8126f645>] sysfs_kf_write+0x45/0x60
[ 88.403008]
[ 88.403008] [<ffffffff8126e8f5>] kernfs_fop_write+0x135/0x1c0
[ 88.403010]
[ 88.403010] [<ffffffff811ee1a8>] __vfs_write+0x28/0x120
[ 88.403011]
[ 88.403011] [<ffffffff811ef2b8>] vfs_write+0xb8/0x1b0
[ 88.403013]
[ 88.403013] [<ffffffff811f0609>] SyS_write+0x49/0xa0
[ 88.403014]
[ 88.403014] [<ffffffff8175e1ee>] entry_SYSCALL_64_fastpath+0x1c/0xb1
[ 88.403014]
other info that might help us debug this:

[ 88.403015] Chain exists of:

[ 88.403015] console_lock
[ 88.403016] -->
[ 88.403016] rcu_preempt_state.barrier_mutex
[ 88.403017] -->
[ 88.403017] cpu_hotplug.lock
[ 88.403017]

[ 88.403017] Possible unsafe locking scenario:

[ 88.403018] CPU0 CPU1
[ 88.403018] ---- ----
[ 88.403018] lock(
[ 88.403019] cpu_hotplug.lock
[ 88.403019] );
[ 88.403020] lock(
[ 88.403020] rcu_preempt_state.barrier_mutex
[ 88.403020] );
[ 88.403021] lock(
[ 88.403021] cpu_hotplug.lock
[ 88.403021] );
[ 88.403021] lock(
[ 88.403022] console_lock
[ 88.403022] );
[ 88.403022]
*** DEADLOCK ***

[ 88.403023] 8 locks held by hib.sh/2591:
[ 88.403023] #0:
[ 88.403024] (
[ 88.403024] sb_writers
[ 88.403025] #5
[ 88.403025] ){.+.+.+}
[ 88.403026] , at:
[ 88.403026] [<ffffffff811ef394>] vfs_write+0x194/0x1b0
[ 88.403026] #1:
[ 88.403027] (
[ 88.403027] &of->mutex
[ 88.403027] ){+.+.+.}
[ 88.403029] , at:
[ 88.403029] [<ffffffff8126e8c1>] kernfs_fop_write+0x101/0x1c0
[ 88.403029] #2:
[ 88.403030] (
[ 88.403030] s_active
[ 88.403031] #197
[ 88.403031] ){.+.+.+}
[ 88.403032] , at:
[ 88.403032] [<ffffffff8126e8c9>] kernfs_fop_write+0x109/0x1c0
[ 88.403033] #3:
[ 88.403033] (
[ 88.403033] pm_mutex
[ 88.403034] ){+.+.+.}
[ 88.403035] , at:
[ 88.403035] [<ffffffff810c8e71>] hibernate+0x41/0x200
[ 88.403036] #4:
[ 88.403036] (
[ 88.403036] device_hotplug_lock
[ 88.403037] ){+.+.+.}
[ 88.403038] , at:
[ 88.403038] [<ffffffff81536a77>] lock_device_hotplug+0x17/0x20
[ 88.403039] #5:
[ 88.403039] (
[ 88.403040] cpu_add_remove_lock
[ 88.403040] ){+.+.+.}
[ 88.403041] , at:
[ 88.403041] [<ffffffff8106c3d7>] freeze_secondary_cpus+0x27/0x3e0
[ 88.403042] #6:
[ 88.403042] (
[ 88.403042] cpu_hotplug.dep_map
[ 88.403043] ){++++++}
[ 88.403044] , at:
[ 88.403044] [<ffffffff8106bce5>] cpu_hotplug_begin+0x5/0xc0
[ 88.403044] #7:
[ 88.403044] (
[ 88.403045] cpu_hotplug.lock
[ 88.403045] ){+.+.+.}
[ 88.403046] , at:
[ 88.403046] [<ffffffff8106bd52>] cpu_hotplug_begin+0x72/0xc0
[ 88.403046]
stack backtrace:
[ 88.403048] CPU: 0 PID: 2591 Comm: hib.sh Not tainted 4.9.0-rc5+ #1
[ 88.403049] Hardware name: LENOVO 2320CTO/2320CTO, BIOS G2ET86WW (2.06 ) 11/13/2012
[ 88.403049] Call Trace:
[ 88.403053] dump_stack+0x67/0x92
[ 88.403055] print_circular_bug+0x1e3/0x250
[ 88.403056] __lock_acquire+0x11cd/0x1440
[ 88.403059] ? debug_lockdep_rcu_enabled+0x1d/0x20
[ 88.403061] lock_acquire+0xb2/0x200
[ 88.403063] ? console_cpu_notify+0xe/0x20
[ 88.403065] ? register_console+0x380/0x380
[ 88.403066] console_lock+0x3c/0x60
[ 88.403068] ? console_cpu_notify+0xe/0x20
[ 88.403069] console_cpu_notify+0xe/0x20
[ 88.403070] cpuhp_invoke_callback+0xa9/0x8b0
[ 88.403072] ? __flow_cache_shrink+0x140/0x140
[ 88.403073] cpuhp_down_callbacks+0x42/0x80
[ 88.403075] _cpu_down+0xa9/0x110
[ 88.403076] freeze_secondary_cpus+0xb6/0x3e0
[ 88.403077] hibernation_snapshot+0x201/0x360
[ 88.403078] hibernate+0x152/0x200
[ 88.403080] state_store+0xd6/0xe0
[ 88.403081] kobj_attr_store+0xf/0x20
[ 88.403082] sysfs_kf_write+0x45/0x60
[ 88.403084] kernfs_fop_write+0x135/0x1c0
[ 88.403086] __vfs_write+0x28/0x120
[ 88.403087] ? preempt_count_sub+0xa1/0x110
[ 88.403089] ? __sb_start_write+0xea/0x200
[ 88.403090] ? vfs_write+0x194/0x1b0
[ 88.403091] vfs_write+0xb8/0x1b0
[ 88.403092] ? trace_hardirqs_on_caller+0x12d/0x1b0
[ 88.403093] SyS_write+0x49/0xa0
[ 88.403095] entry_SYSCALL_64_fastpath+0x1c/0xb1
[ 88.403096] RIP: 0033:0x7f37f6dc51c0
[ 88.403097] RSP: 002b:00007ffe98060948 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 88.403098] RAX: ffffffffffffffda RBX: 0000000000000046 RCX: 00007f37f6dc51c0
[ 88.403098] RDX: 0000000000000005 RSI: 00000000013f5408 RDI: 0000000000000001
[ 88.403099] RBP: 0000000000000004 R08: 00007f37f7088780 R09: 00007f37f78d8700
[ 88.403100] R10: 0000000000000004 R11: 0000000000000246 R12: 00000000004be368
[ 88.403100] R13: 0000000000002002 R14: 00000000013fe400 R15: 0000000000000070
[ 88.417290] Broke affinity for irq 16
[ 88.417293] Broke affinity for irq 18
[ 88.417295] Broke affinity for irq 19
[ 88.418315] smpboot: CPU 2 is now offline
[ 88.461454] Broke affinity for irq 1
[ 88.461460] Broke affinity for irq 8
[ 88.461464] Broke affinity for irq 9
[ 88.461468] Broke affinity for irq 12
[ 88.461473] Broke affinity for irq 16
[ 88.461477] Broke affinity for irq 18
[ 88.461480] Broke affinity for irq 19
[ 88.461485] Broke affinity for irq 23
[ 88.461489] Broke affinity for irq 26
[ 88.461493] Broke affinity for irq 27
[ 88.462528] smpboot: CPU 3 is now offline
[ 88.473978] PM: Creating hibernation image:
[ 88.733315] PM: Need to copy 104116 pages
[ 88.475760] Suspended for 19.000 seconds
[ 88.475851] Enabling non-boot CPUs ...
[ 88.487470] x86: Booting SMP configuration:
[ 88.487479] smpboot: Booting Node 0 Processor 1 APIC 0x1
[ 88.492153] cache: parent cpu1 should not be sleeping
[ 88.492571] CPU1 is up
[ 88.507516] smpboot: Booting Node 0 Processor 2 APIC 0x2
[ 88.511244] cache: parent cpu2 should not be sleeping
[ 88.511626] CPU2 is up
[ 88.527572] smpboot: Booting Node 0 Processor 3 APIC 0x3
[ 88.531225] cache: parent cpu3 should not be sleeping
[ 88.531617] CPU3 is up
[ 88.537517] ACPI : EC: interrupt unblocked
[ 88.540582] sdhci-pci 0000:02:00.0: MMC controller base frequency changed to 50Mhz.
[ 88.583150] PM: noirq restore of devices complete after 45.897 msecs
[ 88.583639] PM: early restore of devices complete after 0.459 msecs
[ 88.625421] usb usb1: root hub lost power or was reset
[ 88.625432] usb usb2: root hub lost power or was reset
[ 88.625608] usb usb3: root hub lost power or was reset
[ 88.625883] usb usb4: root hub lost power or was reset
[ 88.626879] ACPI : button: The lid device is not compliant to SW_LID.
[ 88.626910] ACPI : EC: event unblocked
[ 88.627295] rtc_cmos 00:02: System wakeup disabled by ACPI
[ 88.629508] ehci-pci 0000:00:1a.0: cache line size of 64 is not supported
[ 88.629783] ehci-pci 0000:00:1d.0: cache line size of 64 is not supported
[ 88.636192] sd 2:0:0:0: [sdb] Starting disk
[ 88.636228] sd 0:0:0:0: [sda] Starting disk
[ 88.963083] usb 4-1: reset high-speed USB device number 2 using ehci-pci
[ 88.963085] usb 3-1: reset high-speed USB device number 2 using ehci-pci
[ 88.974052] ata5: SATA link down (SStatus 0 SControl 300)
[ 88.974093] ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 88.974136] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[ 88.974583] ata3.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES) succeeded
[ 88.974596] ata3.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE LOCK) filtered out
[ 88.975119] ata1.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES) succeeded
[ 88.975132] ata1.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE LOCK) filtered out
[ 88.975549] ata3.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES) succeeded
[ 88.975562] ata3.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE LOCK) filtered out
[ 88.975887] ata3.00: configured for UDMA/100
[ 88.977083] ata1.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES) succeeded
[ 88.977112] ata1.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE LOCK) filtered out
[ 88.977937] ata1.00: configured for UDMA/133
[ 89.671711] PM: restore of devices complete after 1046.321 msecs
[ 89.673226] Restarting tasks ... done.
[ 89.673738] usb 1-1: USB disconnect, device number 2
[ 89.823083] usb 1-1: new low-speed USB device number 3 using xhci_hcd
[ 89.969125] usb 1-1: New USB device found, idVendor=046d, idProduct=c045
[ 89.969135] usb 1-1: New USB device strings: Mfr=1, Product=2, SerialNumber=0
[ 89.969141] usb 1-1: Product: USB-PS/2 Optical Mouse
[ 89.969146] usb 1-1: Manufacturer: Logitech
[ 89.974053] input: Logitech USB-PS/2 Optical Mouse as /devices/pci0000:00/0000:00:14.0/usb1/1-1/1-1:1.0/0003:046D:C045.0002/input/input20
[ 90.031487] hid-generic 0003:046D:C045.0002: input,hidraw0: USB HID v1.10 Mouse [Logitech USB-PS/2 Optical Mouse] on usb-0000:00:14.0-1/input0
[ 90.340587] psmouse serio1: synaptics: queried max coordinates: x [..5768], y [..5062]
[ 90.371618] psmouse serio1: synaptics: queried min coordinates: x [1174..], y [790..]
[ 90.421219] e1000e: eth0 NIC Link is Up 100 Mbps Full Duplex, Flow Control: Rx/Tx

--
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.