Re: Freezable workqueue blocks non-freezable workqueue during the system resume process

From: Peter Chen
Date: Tue Feb 23 2016 - 04:52:14 EST


On Tue, Feb 23, 2016 at 11:20:56AM +0800, Peter Chen wrote:
> Hi Tejun Heo and Florian Mickler,
>
> I have a question that during the system resume process, the freezable
> workqueue can be thawed if there is a non-freezable workqueue is
> blocked (At uninterruptable state)?
>
> My case like below, I have a USB OTG (Micro-AB) cable is at USB
> Micro-B port, and there is a USB driver on it, and un-plug this
> cable can wake up system from the suspend. There is a non-freezable
> workqueue ci_otg will be scheduled after disconnecting OTG cable,
> and in its worker ci_otg_work, it will try to disconnect USB drive,
> and flush disk information. But flush disk information is done by
> freezable workqueue writeback, it seeems workqueue writeback is
> never got chance to execute, the workqueue ci_otg is waiting there
> forever, and the system is deadlock.
>

> Both change workqueue ci_otg as freezable or change workqueue writeback
> as non-freezable can fix this problem.
>
Please ignore it, the system is locked at driver's resume,
maybe at scsi or usb driver, so of cos, the freezable processes
can't be thawed.

[ 553.429383] sh D c07de74c 0 694 691 0x00000000
[ 553.435801] Backtrace:
[ 553.438295] [<c07de4fc>] (__schedule) from [<c07dec6c>] (schedule+0x48/0xa0)
[ 553.445358] r10:edd3c054 r9:edd3c078 r8:edddbd50 r7:edcbbc00 r6:c1377c34 r5:60000153
[ 553.453313] r4:eddda000
[ 553.455896] [<c07dec24>] (schedule) from [<c07deff8>] (schedule_preempt_disabled+0x10/0x14)
[ 553.464261] r4:edd3c058 r3:0000000a
[ 553.467910] [<c07defe8>] (schedule_preempt_disabled) from [<c07e0bbc>] (mutex_lock_nested+0x1a0/0x3e8)
[ 553.477254] [<c07e0a1c>] (mutex_lock_nested) from [<c03e927c>] (dpm_complete+0xc0/0x1b0)
[ 553.485358] r10:00561408 r9:edd3c054 r8:c0b4863c r7:edddbd90 r6:c0b485d8 r5:edd3c020
[ 553.493313] r4:edd3c0d0
[ 553.495896] [<c03e91bc>] (dpm_complete) from [<c03e9388>] (dpm_resume_end+0x1c/0x20)
[ 553.503652] r9:00000000 r8:c0b1a9d0 r7:c1334ec0 r6:c1334edc r5:00000003 r4:00000010
[ 553.511544] [<c03e936c>] (dpm_resume_end) from [<c0079894>] (suspend_devices_and_enter+0x158/0x504)
[ 553.520604] r4:00000000 r3:c1334efc
[ 553.524250] [<c007973c>] (suspend_devices_and_enter) from [<c0079e74>] (pm_suspend+0x234/0x2cc)
[ 553.532961] r10:00561408 r9:ed6b7300 r8:00000004 r7:c1334eec r6:00000000 r5:c1334ee8
[ 553.540914] r4:00000003
[ 553.543493] [<c0079c40>] (pm_suspend) from [<c0078a6c>] (state_store+0x6c/0xc0)
[ 553.550815] r6:00000003 r5:c09b2ca4 r4:00000003 r3:0000006d
[ 553.556599] [<c0078a00>] (state_store) from [<c02e71fc>] (kobj_attr_store+0x1c/0x28)
[ 553.564358] r9:00000004 r8:c0010004 r7:edf9480c r6:ed6b7300 r5:edf94800 r4:00000004
[ 553.572258] [<c02e71e0>] (kobj_attr_store) from [<c01885d4>] (sysfs_kf_write+0x54/0x58)
[ 553.580295] [<c0188580>] (sysfs_kf_write) from [<c0187b4c>] (kernfs_fop_write+0xd8/0x1fc)
[ 553.588487] r6:ed6b7300 r5:00000000 r4:00000000 r3:c0188580
[ 553.594262] [<c0187a74>] (kernfs_fop_write) from [<c0114e98>] (__vfs_write+0x2c/0xe0)
[ 553.602105] r10:00000000 r9:eddda000 r8:c0010004 r7:edddbf80 r6:00561408 r5:edddbf80
[ 553.610060] r4:ed445280
[ 553.612641] [<c0114e6c>] (__vfs_write) from [<c0115d78>] (vfs_write+0x98/0x16c)
[ 553.619963] r8:c0010004 r7:edddbf80 r6:00561408 r5:00000004 r4:ed445280
[ 553.626800] [<c0115ce0>] (vfs_write) from [<c0116818>] (SyS_write+0x4c/0xa8)
[ 553.633861] r8:c0010004 r7:00561408 r6:00000004 r5:ed445280 r4:ed445280
[ 553.640705] [<c01167cc>] (SyS_write) from [<c000fe60>] (ret_fast_syscall+0x0/0x1c)
[ 553.648291] r7:00000004 r6:b6f27d60 r5:00561408 r4:00000004

> The call stack like below:
>
> [ 546.987379] writeback S c07de74c 0 12 2 0x00000000
> [ 546.993804] Backtrace:
> [ 546.996307] [<c07de4fc>] (__schedule) from [<c07dec6c>] (schedule+0x48/0xa0)
> [ 547.003370] r10:ef14bc80 r9:ef14ca00 r8:00000000 r7:c0045c90 r6:ef14bc80 r5:ef14bc98
> [ 547.011325] r4:ef164000
> [ 547.013907] [<c07dec24>] (schedule) from [<c0045f20>] (rescuer_thread+0x290/0x308)
> [ 547.021490] r4:00000000 r3:00000008
> [ 547.025136] [<c0045c90>] (rescuer_thread) from [<c004bab4>] (kthread+0xdc/0xf8)
> [ 547.032459] r10:00000000 r9:00000000 r8:00000000 r7:c0045c90 r6:ef14bc80 r5:ef1526c0
> [ 547.040412] r4:00000000
> [ 547.042993] [<c004b9d8>] (kthread) from [<c000fef0>] (ret_from_fork+0x14/0x24)
> [ 547.050229] r7:00000000 r6:00000000 r5:c004b9d8 r4:ef1526c0
> [ 555.178869] kworker/u2:13 D c07de74c 0 826 2 0x00000000
>
> [ 555.185310] Workqueue: ci_otg ci_otg_work
> [ 555.189353] Backtrace:
> [ 555.191849] [<c07de4fc>] (__schedule) from [<c07dec6c>] (schedule+0x48/0xa0)
> [ 555.198912] r10:ee471ba0 r9:00000000 r8:00000000 r7:00000002 r6:ee470000 r5:ee471ba4
> [ 555.206867] r4:ee470000
> [ 555.209453] [<c07dec24>] (schedule) from [<c07e2fc4>] (schedule_timeout+0x15c/0x1e0)
> [ 555.217212] r4:7fffffff r3:edc2b000
> [ 555.220862] [<c07e2e68>] (schedule_timeout) from [<c07df6c8>] (wait_for_common+0x94/0x144)
> [ 555.229140] r8:00000000 r7:00000002 r6:ee470000 r5:ee471ba4 r4:7fffffff
> [ 555.235980] [<c07df634>] (wait_for_common) from [<c07df790>] (wait_for_completion+0x18/0x1c)
> [ 555.244430] r10:00000001 r9:c0b5563c r8:c0042e48 r7:ef086000 r6:eea4372c r5:ef131b00
> [ 555.252383] r4:00000000
> [ 555.254970] [<c07df778>] (wait_for_completion) from [<c0043cb8>] (flush_work+0x19c/0x234)
> [ 555.263177] [<c0043b1c>] (flush_work) from [<c0043fac>] (flush_delayed_work+0x48/0x4c)
> [ 555.271106] r8:ed5b5000 r7:c0b38a3c r6:eea439cc r5:eea4372c r4:eea4372c
> [ 555.277958] [<c0043f64>] (flush_delayed_work) from [<c00eae18>] (bdi_unregister+0x84/0xec)
> [ 555.286236] r4:eea43520 r3:20000153
> [ 555.289885] [<c00ead94>] (bdi_unregister) from [<c02c2154>] (blk_cleanup_queue+0x180/0x29c)
> [ 555.298250] r5:eea43808 r4:eea43400
> [ 555.301909] [<c02c1fd4>] (blk_cleanup_queue) from [<c0417914>] (__scsi_remove_device+0x48/0xb8)
> [ 555.310623] r7:00000000 r6:20000153 r5:ededa950 r4:ededa800
> [ 555.316403] [<c04178cc>] (__scsi_remove_device) from [<c0415e90>] (scsi_forget_host+0x64/0x68)
> [ 555.325028] r5:ededa800 r4:ed5b5000
> [ 555.328689] [<c0415e2c>] (scsi_forget_host) from [<c0409828>] (scsi_remove_host+0x78/0x104)
> [ 555.337054] r5:ed5b5068 r4:ed5b5000
> [ 555.340709] [<c04097b0>] (scsi_remove_host) from [<c04cdfcc>] (usb_stor_disconnect+0x50/0xb4)
> [ 555.349247] r6:ed5b56e4 r5:ed5b5818 r4:ed5b5690 r3:00000008
> [ 555.355025] [<c04cdf7c>] (usb_stor_disconnect) from [<c04b3bc8>] (usb_unbind_interface+0x78/0x25c)
> [ 555.363997] r8:c13919b4 r7:edd3c000 r6:edd3c020 r5:ee551c68 r4:ee551c00 r3:c04cdf7c
> [ 555.371892] [<c04b3b50>] (usb_unbind_interface) from [<c03dc248>] (__device_release_driver+0x8c/0x118)
> [ 555.381213] r10:00000001 r9:edd90c00 r8:c13919b4 r7:ee551c68 r6:c0b546e0 r5:c0b5563c
> [ 555.389167] r4:edd3c020
> [ 555.391752] [<c03dc1bc>] (__device_release_driver) from [<c03dc2fc>] (device_release_driver+0x28/0x34)
> [ 555.401071] r5:edd3c020 r4:edd3c054
> [ 555.404721] [<c03dc2d4>] (device_release_driver) from [<c03db304>] (bus_remove_device+0xe0/0x110)
> [ 555.413607] r5:edd3c020 r4:ef17f04c
> [ 555.417253] [<c03db224>] (bus_remove_device) from [<c03d8128>] (device_del+0x114/0x21c)
> [ 555.425270] r6:edd3c028 r5:edd3c020 r4:ee551c00 r3:00000000
> [ 555.431045] [<c03d8014>] (device_del) from [<c04b1560>] (usb_disable_device+0xa4/0x1e8)
> [ 555.439061] r8:edd3c000 r7:eded8000 r6:00000000 r5:00000001 r4:ee551c00
> [ 555.445906] [<c04b14bc>] (usb_disable_device) from [<c04a8e54>] (usb_disconnect+0x74/0x224)
> [ 555.454271] r9:edd90c00 r8:ee551000 r7:ee551c68 r6:ee551c9c r5:ee551c00 r4:00000001
> [ 555.462156] [<c04a8de0>] (usb_disconnect) from [<c04a8fb8>] (usb_disconnect+0x1d8/0x224)
> [ 555.470259] r10:00000001 r9:edd90000 r8:ee471e2c r7:ee551468 r6:ee55149c r5:ee551400
> [ 555.478213] r4:00000001
> [ 555.480797] [<c04a8de0>] (usb_disconnect) from [<c04ae5ec>] (usb_remove_hcd+0xa0/0x1ac)
> [ 555.488813] r10:00000001 r9:ee471eb0 r8:00000000 r7:ef3d9500 r6:eded810c r5:eded80b0
> [ 555.496765] r4:eded8000
> [ 555.499351] [<c04ae54c>] (usb_remove_hcd) from [<c04d4158>] (host_stop+0x28/0x64)
> [ 555.506847] r6:eeb50010 r5:eded8000 r4:eeb51010
> [ 555.511563] [<c04d4130>] (host_stop) from [<c04d09b8>] (ci_otg_work+0xc4/0x124)
> [ 555.518885] r6:00000001 r5:eeb50010 r4:eeb502a0 r3:c04d4130
> [ 555.524665] [<c04d08f4>] (ci_otg_work) from [<c00454f0>] (process_one_work+0x194/0x420)
> [ 555.532682] r6:ef086000 r5:eeb502a0 r4:edc44480
> [ 555.537393] [<c004535c>] (process_one_work) from [<c00457b0>] (worker_thread+0x34/0x514)
> [ 555.545496] r10:edc44480 r9:ef086000 r8:c0b1a100 r7:ef086034 r6:00000088 r5:edc44498
> [ 555.553450] r4:ef086000
> [ 555.556032] [<c004577c>] (worker_thread) from [<c004bab4>] (kthread+0xdc/0xf8)
> [ 555.563268] r10:00000000 r9:00000000 r8:00000000 r7:c004577c r6:edc44480 r5:eddc15c0
> [ 555.571221] r4:00000000
> [ 555.573804] [<c004b9d8>] (kthread) from [<c000fef0>] (ret_from_fork+0x14/0x24)
> [ 555.581040] r7:00000000 r6:00000000 r5:c004b9d8 r4:eddc15c0
> [ 555.586803] kworker/u2:14 S c07de74c 0 827 2 0x00000000
> [ 555.593232] Backtrace:
> [ 555.595739] [<c07de4fc>] (__schedule) from [<c07dec6c>] (schedule+0x48/0xa0)
> [ 555.602806] r10:edc44200 r9:00000000 r8:00000000 r7:ef086034 r6:edc44200 r5:ef086000
> [ 555.610761] r4:ed72a000
> [ 555.613345] [<c07dec24>] (schedule) from [<c0045834>] (worker_thread+0xb8/0x514)
> [ 555.620755] r4:ef086000 r3:ed72bef0
> [ 555.624399] [<c004577c>] (worker_thread) from [<c004bab4>] (kthread+0xdc/0xf8)
> [ 555.631635] r10:00000000 r9:00000000 r8:00000000 r7:c004577c r6:edc44200 r5:eddc15c0
> [ 555.639589] r4:00000000
> [ 555.642170] [<c004b9d8>] (kthread) from [<c000fef0>] (ret_from_fork+0x14/0x24)
> [ 555.649406] r7:00000000 r6:00000000 r5:c004b9d8 r4:eddc15c0
> [ 555.655168]
> [ 555.655168] Showing all locks held in the system:
> [ 555.661424] 5 locks held by sh/694:
> [ 555.664926] #0: (sb_writers#6){.+.+.+}, at: [<c0118f0c>] __sb_start_write+0xb0/0xbc
> [ 555.672921] #1: (&of->mutex){+.+.+.}, at: [<c0187ad4>] kernfs_fop_write+0x60/0x1fc
> [ 555.680805] #2: (s_active#211){.+.+.+}, at: [<c0187adc>] kernfs_fop_write+0x68/0x1fc
> [ 555.688873] #3: (pm_mutex){+.+.+.}, at: [<c0079d10>] pm_suspend+0xd0/0x2cc
> [ 555.696052] #4: (&dev->mutex){......}, at: [<c03e927c>] dpm_complete+0xc0/0x1b0
> [ 555.703684] 7 locks held by kworker/u2:13/826:
> [ 555.708140] #0: ("%s""ci_otg"){++++.+}, at: [<c0045484>] process_one_work+0x128/0x420
> [ 555.716277] #1: ((&ci->work)){+.+.+.}, at: [<c0045484>] process_one_work+0x128/0x420
> [ 555.724317] #2: (usb_bus_list_lock){+.+.+.}, at: [<c04ae5e4>] usb_remove_hcd+0x98/0x1ac
> [ 555.732626] #3: (&dev->mutex){......}, at: [<c04a8e28>] usb_disconnect+0x48/0x224
> [ 555.740403] #4: (&dev->mutex){......}, at: [<c04a8e28>] usb_disconnect+0x48/0x224
> [ 555.748179] #5: (&dev->mutex){......}, at: [<c03dc2f4>] device_release_driver+0x20/0x34
> [ 555.756487] #6: (&shost->scan_mutex){+.+.+.}, at: [<c04097d0>] scsi_remove_host+0x20/0x104
> [ 555.765062]
> [ 555.766567] =============================================
> [ 555.766567]
> [ 555.773467] Showing busy workqueues and worker pools:
> [ 555.778544] workqueue events_freezable: flags=0x4
> [ 555.783371] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=0/0
> [ 555.789316] delayed: thermal_zone_device_check
> [ 555.794191] workqueue pm: flags=0x4
> [ 555.797700] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=0/0
> [ 555.803636] delayed: pm_runtime_work
> [ 555.807629] workqueue writeback: flags=0x4e
> [ 555.811894] pwq 2: cpus=0 flags=0x4 nice=0 active=0/0
> [ 555.817209] delayed: wb_workfn BAR(826)
> [ 555.821493] workqueue usb_hub_wq: flags=0x4
> [ 555.825693] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=0/0
> [ 555.831627] delayed: hub_event
> [ 555.835119] workqueue ci_otg: flags=0x2000a
> [ 555.839320] pwq 2: cpus=0 flags=0x4 nice=0 active=1/1
> [ 555.844623] in-flight: 826:ci_otg_work
> [ 555.848810] pool 2: cpus=0 flags=0x4 nice=0 hung=0s workers=15 idle: 814 733 700 824 6 822 820 819 816 817 815 813 827 825
>

--

Best Regards,
Peter Chen