system hung up when offlining CPUs

From: YASUAKI ISHIMATSU
Date: Tue Aug 08 2017 - 15:24:44 EST


Hi Thomas,

When I offlined all CPUs except cpu0, the system hung with the following message:

[...] INFO: task kworker/u384:1:1234 blocked for more than 120 seconds.
[...] Not tainted 4.12.0-rc6+ #19
[...] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[...] kworker/u384:1 D 0 1234 2 0x00000000
[...] Workqueue: writeback wb_workfn (flush-253:0)
[...] Call Trace:
[...] __schedule+0x28a/0x880
[...] schedule+0x36/0x80
[...] schedule_timeout+0x249/0x300
[...] ? __schedule+0x292/0x880
[...] __down_common+0xfc/0x132
[...] ? _xfs_buf_find+0x2bb/0x510 [xfs]
[...] __down+0x1d/0x1f
[...] down+0x41/0x50
[...] xfs_buf_lock+0x3c/0xf0 [xfs]
[...] _xfs_buf_find+0x2bb/0x510 [xfs]
[...] xfs_buf_get_map+0x2a/0x280 [xfs]
[...] xfs_buf_read_map+0x2d/0x180 [xfs]
[...] xfs_trans_read_buf_map+0xf5/0x310 [xfs]
[...] xfs_btree_read_buf_block.constprop.35+0x78/0xc0 [xfs]
[...] xfs_btree_lookup_get_block+0x88/0x160 [xfs]
[...] xfs_btree_lookup+0xd0/0x3b0 [xfs]
[...] ? xfs_allocbt_init_cursor+0x41/0xe0 [xfs]
[...] xfs_alloc_ag_vextent_near+0xaf/0xaa0 [xfs]
[...] xfs_alloc_ag_vextent+0x13c/0x150 [xfs]
[...] xfs_alloc_vextent+0x425/0x590 [xfs]
[...] xfs_bmap_btalloc+0x448/0x770 [xfs]
[...] xfs_bmap_alloc+0xe/0x10 [xfs]
[...] xfs_bmapi_write+0x61d/0xc10 [xfs]
[...] ? kmem_zone_alloc+0x96/0x100 [xfs]
[...] xfs_iomap_write_allocate+0x199/0x3a0 [xfs]
[...] xfs_map_blocks+0x1e8/0x260 [xfs]
[...] xfs_do_writepage+0x1ca/0x680 [xfs]
[...] write_cache_pages+0x26f/0x510
[...] ? xfs_vm_set_page_dirty+0x1d0/0x1d0 [xfs]
[...] ? blk_mq_dispatch_rq_list+0x305/0x410
[...] ? deadline_remove_request+0x7d/0xc0
[...] xfs_vm_writepages+0xb6/0xd0 [xfs]
[...] do_writepages+0x1c/0x70
[...] __writeback_single_inode+0x45/0x320
[...] writeback_sb_inodes+0x280/0x570
[...] __writeback_inodes_wb+0x8c/0xc0
[...] wb_writeback+0x276/0x310
[...] ? get_nr_dirty_inodes+0x4d/0x80
[...] wb_workfn+0x2d4/0x3b0
[...] process_one_work+0x149/0x360
[...] worker_thread+0x4d/0x3c0
[...] kthread+0x109/0x140
[...] ? rescuer_thread+0x380/0x380
[...] ? kthread_park+0x60/0x60
[...] ret_from_fork+0x25/0x30
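
For anyone trying to reproduce this: the report does not say how the CPUs
were taken offline, so the following is only a minimal sketch, assuming the
standard sysfs hotplug files (/sys/devices/system/cpu/cpuN/online) were used.
The function name and its parameter are hypothetical and exist only for
illustration.

```python
import os
import re


def offline_all_but_cpu0(sysfs_cpu_dir="/sys/devices/system/cpu"):
    """Write 0 to cpuN/online for every N > 0 and return the CPUs offlined.

    Assumption: CPUs are offlined through the sysfs hotplug control files.
    The default path is the usual location; pass another directory to try
    the logic without touching a real system.
    """
    offlined = []
    # Only directories named cpu<number> are CPUs; skip cpufreq, cpuidle, etc.
    names = [n for n in os.listdir(sysfs_cpu_dir) if re.fullmatch(r"cpu\d+", n)]
    for name in sorted(names, key=lambda n: int(n[3:])):
        if name == "cpu0":
            continue  # keep cpu0 online, as in the report
        control = os.path.join(sysfs_cpu_dir, name, "online")
        if not os.path.exists(control):
            continue  # this CPU exposes no hotplug control file
        with open(control, "w") as f:
            f.write("0")
        offlined.append(name)
    return offlined
```

Calling offline_all_but_cpu0() with the default path needs root and really
does take the CPUs offline, so it should only be run on a test machine.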


I bisected the upstream kernel and found that the following commit
introduced the issue:

commit c5cb83bb337c25caae995d992d1cdf9b317f83de
Author: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Date: Tue Jun 20 01:37:51 2017 +0200

genirq/cpuhotplug: Handle managed IRQs on CPU hotplug


Thanks,
Yasuaki Ishimatsu