[Regression] fstrim hangs on Hyper-V: caused by "block: improve handling of the magic discard payload"
From: Dexuan Cui
Date: Thu Jan 12 2017 - 05:56:27 EST
Hi,
Recently, fstrim and mkfs always hang in a Linux VM running on Hyper-V 2012 R2 or 2016.
The VM uses the latest mainline kernel (v4.10-rc3).
git-bisect shows the patch
"block: improve handling of the magic discard payload (f9d03f96)"
causes the issue.
If I revert the patch, the issue goes away.
When the issue happens, any new shell command that causes disk I/O hangs too, and
I can't even reboot the VM because of the pending I/O.
It seems that blkdev_issue_discard() never returns, which I think means the SCSI UNMAP
command(s) never complete for some reason.
Any idea why the patch can cause this?
Thanks!
-- Dexuan
PS: this is the call trace:
[ 1450.976205] INFO: task fstrim:1300 blocked for more than 120 seconds.
[ 1450.976264] Not tainted 4.9.0+ #58
[ 1450.976291] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1450.976342] fstrim D 0 1300 1280 0x00000000
[ 1450.976382] Call Trace:
[ 1450.976412] ? __schedule+0x232/0x700
[ 1450.976442] ? try_to_grab_pending+0xb3/0x160
[ 1450.976476] schedule+0x36/0x80
[ 1450.976501] schedule_timeout+0x235/0x3f0
[ 1450.976532] ? blk_run_queue_async+0x3c/0x40
[ 1450.976565] io_schedule_timeout+0xa4/0x110
[ 1450.976596] wait_for_completion_io+0xa5/0x110
[ 1450.976628] ? wake_up_q+0x70/0x70
[ 1450.976654] submit_bio_wait+0x59/0x70
[ 1450.976683] blkdev_issue_discard+0x6a/0xb0
[ 1450.976783] xfs_trim_extents+0x24c/0x410 [xfs]
[ 1450.976862] xfs_ioc_trim+0x157/0x1c0 [xfs]
[ 1450.976938] xfs_file_ioctl+0x8ee/0xb20 [xfs]
[ 1450.976972] ? path_openat+0x3fb/0x13f0
[ 1450.977002] ? page_add_file_rmap+0x58/0x140
[ 1450.977035] ? alloc_set_pte+0x4ee/0x640
[ 1450.977065] ? do_filp_open+0x92/0xe0
[ 1450.977093] ? _copy_to_user+0x2e/0x40
[ 1450.977121] ? cp_new_stat+0x141/0x160
[ 1450.977151] do_vfs_ioctl+0x92/0x5a0
[ 1450.977178] ? SYSC_newfstat+0x25/0x30
[ 1450.977206] SyS_ioctl+0x79/0x90
[ 1450.977232] entry_SYSCALL_64_fastpath+0x1e/0xad
[ 1450.977264] RIP: 0033:0x7f8cac393687
[ 1450.977290] RSP: 002b:00007ffdce06fa38 EFLAGS: 00000202 ORIG_RAX: 0000000000000010
[ 1450.977340] RAX: ffffffffffffffda RBX: 0000000000609330 RCX: 00007f8cac393687
[ 1450.977386] RDX: 00007ffdce06fa40 RSI: 00000000c0185879 RDI: 0000000000000003
[ 1450.977431] RBP: 00007ffdce06fd18 R08: 0000000000000000 R09: 0000000000000000
[ 1450.977476] R10: 000000000000053f R11: 0000000000000202 R12: 0000000000000000
[ 1450.977522] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 1450.977570] INFO: task ls:1304 blocked for more than 120 seconds.
[ 1450.977609] Not tainted 4.9.0+ #58
[ 1450.977636] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1450.977685] ls D 0 1304 1219 0x00000000
[ 1450.977723] Call Trace:
[ 1450.977745] ? __schedule+0x232/0x700
[ 1450.977774] ? __blk_run_queue+0x33/0x40
[ 1450.977803] ? queue_unplugged+0x2a/0xb0
[ 1450.977833] schedule+0x36/0x80
[ 1450.977857] schedule_timeout+0x235/0x3f0
[ 1450.977886] ? blk_finish_plug+0x2c/0x40
[ 1450.977963] ? _xfs_buf_ioapply+0x324/0x440 [xfs]
[ 1450.977998] wait_for_completion+0xa5/0x110
[ 1450.978028] ? wake_up_q+0x70/0x70
[ 1450.978107] ? xfs_trans_read_buf_map+0xf5/0x330 [xfs]
[ 1450.979283] ? _xfs_buf_read+0x23/0x30 [xfs]
[ 1450.980522] xfs_buf_submit_wait+0x7f/0x210 [xfs]
[ 1450.981706] ? xfs_trans_read_buf_map+0xf5/0x330 [xfs]
[ 1450.982863] _xfs_buf_read+0x23/0x30 [xfs]
[ 1450.984420] xfs_buf_read_map+0x108/0x180 [xfs]
[ 1450.985559] xfs_trans_read_buf_map+0xf5/0x330 [xfs]
[ 1450.986672] xfs_imap_to_bp+0x5f/0xc0 [xfs]
[ 1450.987761] xfs_iread+0x79/0x320 [xfs]
[ 1450.988894] xfs_iget+0x32a/0x840 [xfs]
[ 1450.990055] xfs_lookup+0xc6/0xe0 [xfs]
[ 1450.991132] xfs_vn_lookup+0x4f/0x90 [xfs]
[ 1450.992221] lookup_slow+0x96/0x140
[ 1450.993254] walk_component+0x1ca/0x2f0
[ 1450.994283] ? path_init+0x1d9/0x330
[ 1450.995309] ? mntput+0x24/0x40
[ 1450.996955] path_lookupat+0x5d/0x110
[ 1450.997979] filename_lookup+0x9e/0x150
[ 1450.999001] ? kmem_cache_alloc+0xd7/0x1b0
[ 1451.000126] ? getname_flags+0x56/0x1f0
[ 1451.001150] ? getname_flags+0x72/0x1f0
[ 1451.002164] user_path_at_empty+0x36/0x40
[ 1451.003173] vfs_fstatat+0x53/0xa0
[ 1451.004223] SYSC_newlstat+0x22/0x40
[ 1451.005232] SyS_newlstat+0xe/0x10
[ 1451.006233] entry_SYSCALL_64_fastpath+0x1e/0xad
[ 1451.007750] RIP: 0033:0x7ff2730993d5
[ 1451.008820] RSP: 002b:00007ffc7c1650c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000006
[ 1451.009880] RAX: ffffffffffffffda RBX: 00007ff273366b78 RCX: 00007ff2730993d5
[ 1451.010953] RDX: 00000000019dfb20 RSI: 00000000019dfb20 RDI: 00007ffc7c1650d0
[ 1451.012078] RBP: 00007ff273366b20 R08: 0000000000000000 R09: 00000000000000c0
[ 1451.013175] R10: 00000000019e4550 R11: 0000000000000246 R12: 0000000000008041
[ 1451.014260] R13: 00007ff273366b78 R14: 000000000000270f R15: 00007ff273366b78
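For readers cross-checking the first trace: the fstrim register dump shows ORIG_RAX 0x10 (the ioctl syscall on x86_64) with RSI 0xc0185879, and that value decodes to the FITRIM request, which matches the xfs_ioc_trim -> blkdev_issue_discard path above. The sketch below rebuilds the number from the generic Linux ioctl encoding and shows roughly what an fstrim-like caller does. It is an illustration only, not the fstrim source: the helper names (iowr, decode_ioctl, trim) are made up for this note, and the bit layout assumed is the asm-generic one used on x86.

```python
import fcntl
import os
import struct

# asm-generic ioctl encoding (assumed, as on x86): dir:2 | size:14 | type:8 | nr:8
IOC_WRITE, IOC_READ = 1, 2

# struct fstrim_range { __u64 start; __u64 len; __u64 minlen; }
FSTRIM_RANGE = struct.Struct("=QQQ")


def iowr(type_char, nr, size):
    """Hypothetical helper mirroring the kernel's _IOWR() macro."""
    return ((IOC_READ | IOC_WRITE) << 30) | (size << 16) | (ord(type_char) << 8) | nr


# FITRIM is defined in linux/fs.h as _IOWR('X', 121, struct fstrim_range)
FITRIM = iowr("X", 121, FSTRIM_RANGE.size)


def decode_ioctl(cmd):
    """Split a raw ioctl request number (e.g. RSI in the trace) into its fields."""
    return {
        "dir": (cmd >> 30) & 0x3,
        "size": (cmd >> 16) & 0x3FFF,
        "type": chr((cmd >> 8) & 0xFF),
        "nr": cmd & 0xFF,
    }


def trim(mountpoint, start=0, length=2**64 - 1, minlen=0):
    """Issue FITRIM on a mounted filesystem; returns the byte count the kernel reports.

    Needs root. On an affected kernel this call is expected to block, as in the report.
    """
    buf = bytearray(FSTRIM_RANGE.pack(start, length, minlen))
    fd = os.open(mountpoint, os.O_RDONLY)
    try:
        fcntl.ioctl(fd, FITRIM, buf, True)  # kernel writes the trimmed length back
    finally:
        os.close(fd)
    return FSTRIM_RANGE.unpack(buf)[1]


if __name__ == "__main__":
    print(hex(FITRIM))
    print(decode_ioctl(0xC0185879))
```

The script only prints the decoded fields when run directly; trim() is never called automatically, since on a kernel with this regression it would presumably hang the calling process the same way fstrim does.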