RE: [PATCH] genhd: Do not hold event lock when scheduling workqueue elements

From: Dexuan Cui
Date: Mon Feb 06 2017 - 22:49:03 EST


> From: Bart Van Assche [mailto:Bart.VanAssche@xxxxxxxxxxx]
>
> On Tue, 2017-02-07 at 02:23 +0000, Dexuan Cui wrote:
> > Any news on this thread?
> >
> > The issue is still blocking Linux from booting up normally in my test. :-(
> >
> > Have we identified the faulty patch?
> > If so, at least I can try to revert it to boot up.
>
> It's interesting that you have a reproducible testcase. If you can tell me how to
> reproduce this I'll have a look at it together with Hannes.
>
> Bart.

I'm running an Ubuntu 16.04 guest on Hyper-V with the guest kernel replaced
by a linux-next kernel.

I can boot the guest with linux-next's next-20170130 without any issue,
but with next-20170131 and every later tag I have tried, the guest fails to boot.

With next-20170203 (mentioned in my mail last Friday), I got the same
call trace as Hannes.

With today's linux-next (next-20170206), the call trace has changed to
the one below:

(Please see the attached files for the kernel config and the full kernel log.)
(I applied Hannes's patch from this thread, but the guest still fails to boot.)

[ 121.824158] INFO: task systemd-udevd:91 blocked for more than 60 seconds.
[ 121.854885] Not tainted 4.10.0-rc6-next-20170206+ #1
[ 121.885004] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 121.927618] systemd-udevd D12816 91 86 0x00000000
[ 121.952912] Call Trace:
[ 121.964366] __schedule+0x2a9/0x900
[ 121.979931] schedule+0x36/0x80
[ 121.995288] async_synchronize_cookie_domain+0x91/0x130
[ 122.023036] ? remove_wait_queue+0x70/0x70
[ 122.051383] async_synchronize_full+0x17/0x20
[ 122.076925] do_init_module+0xc1/0x1f9
[ 122.097530] load_module+0x24bc/0x2980
[ 122.118418] ? ref_module+0x1c0/0x1c0
[ 122.139060] SYSC_finit_module+0xbc/0xf0
[ 122.161566] SyS_finit_module+0xe/0x10
[ 122.185397] entry_SYSCALL_64_fastpath+0x1e/0xb2
[ 122.221880] RIP: 0033:0x7f1d69105c19
[ 122.248526] RSP: 002b:00007ffe34dc3928 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[ 122.283349] RAX: ffffffffffffffda RBX: 0000000000000005 RCX: 00007f1d69105c19
[ 122.315987] RDX: 0000000000000000 RSI: 00007f1d695fbe2a RDI: 000000000000000c
[ 122.354369] RBP: 00007ffe34dc2930 R08: 0000000000000000 R09: 0000000000000000
[ 122.407496] R10: 000000000000000c R11: 0000000000000246 R12: 000055f0b9b910a0
[ 122.443667] R13: 00007ffe34dc2910 R14: 0000000000000005 R15: 000000000aba9500
[ 122.475741]
[ 122.475741] Showing all locks held in the system:
[ 122.503742] 2 locks held by khungtaskd/17:
[ 122.524260] #0: (rcu_read_lock){......}, at: [<ffffffff9a10d5f1>] watchdog+0xa1/0x3d0
[ 122.569110] #1: (tasklist_lock){......}, at: [<ffffffff9a0aaf8d>] debug_show_all_locks+0x3d/0x1a0
[ 122.623903] 2 locks held by kworker/u128:1/61:
[ 122.654030] #0: ("events_unbound"){......}, at: [<ffffffff9a079035>] process_one_work+0x175/0x540
[ 122.710469] #1: ((&entry->work)){......}, at: [<ffffffff9a079035>] process_one_work+0x175/0x540
[ 122.770659]

Thanks,
-- Dexuan

Attachment: kernel.config.zip
Description: kernel.config.zip

Attachment: putty.log.zip
Description: putty.log.zip