Re: [PATCH] Hotplug: fix the bug that the system is down, when memory is not in node0 and cpu is logically hotadded.

From: Tejun Heo
Date: Fri May 08 2015 - 11:23:19 EST


Cc'ing Lai, Gu and Kamezawa as they've been working in the area for a
while now. Gu, is this related to what you've been working on?

Thanks.

On Fri, May 08, 2015 at 07:16:40PM +0800, Song Xiumiao wrote:
> From: songxiumiao <songxiumiao@xxxxxxxxxx>
>
> Analysing the call trace of the bug, we find that create_worker()
> tries to allocate memory from node0. Because node0 is offline, the
> allocation fails. Add a condition that checks the node is online, so
> that the system only allocates memory from an online node.
>
> The bug information follows:
> [root@localhost ~]# echo 1 > /sys/devices/system/cpu/cpu90/online
> [ 225.611209] smpboot: Booting Node 2 Processor 90 APIC 0x40
> [18446744029.482996] kvm: enabling virtualization on CPU90
> [ 225.725503] TSC synchronization [CPU#43 -> CPU#90]:
> [ 225.730952] Measured 672516581900 cycles TSC warp between CPUs, turning off TSC clock.
> [ 225.739800] tsc: Marking TSC unstable due to check_tsc_sync_source failed
> [ 225.755126] BUG: unable to handle kernel paging request at 0000000000001b08
> [ 225.762931] IP: [<ffffffff81182597>] __alloc_pages_nodemask+0xb7/0x940
> [ 225.770247] PGD 449bb0067 PUD 46110e067 PMD 0
> [ 225.775248] Oops: 0000 [#1] SMP
> [ 225.778875] Modules linked in: xt_CHECKSUM ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 ipt_REJECT nf_reject_ipv4 nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntracd
> [ 225.868198] CPU: 43 PID: 5400 Comm: bash Not tainted 4.0.0-rc4-bug-fixed-remove #16
> [ 225.876754] Hardware name: Insyde Brickland/Type2 - Board Product Name1, BIOS Brickland.05.04.15.0024 02/28/2015
> [ 225.888122] task: ffff88045a3d8da0 ti: ffff880446120000 task.ti: ffff880446120000
> [ 225.896484] RIP: 0010:[<ffffffff81182597>] [<ffffffff81182597>] __alloc_pages_nodemask+0xb7/0x940
> [ 225.906509] RSP: 0018:ffff880446123918 EFLAGS: 00010246
> [ 225.912443] RAX: 0000000000001b00 RBX: 0000000000000010 RCX: 0000000000000000
> [ 225.920416] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00000000002052d0
> [ 225.928388] RBP: ffff880446123a08 R08: ffff880460eca0c0 R09: 0000000060eca101
> [ 225.936361] R10: ffff88046d007300 R11: ffffffff8108dd31 R12: 000000000001002a
> [ 225.944334] R13: 00000000002052d0 R14: 0000000000000001 R15: 00000000000040d0
> [ 225.952306] FS: 00007f9386450740(0000) GS:ffff88046db60000(0000) knlGS:0000000000000000
> [ 225.961346] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 225.967765] CR2: 0000000000001b08 CR3: 00000004612a3000 CR4: 00000000001407e0
> [ 225.975735] Stack:
> [ 225.977981] 00000000002052d0 0000000000000000 0000000000000003 ffff88045a3d8da0
> [ 225.986291] ffff880446123988 ffffffff811c7f81 ffff88045a3d8da0 0000000000000000
> [ 225.994597] 000080d000000002 ffff88046d005500 000000000003000f 002052d0002052d0
> [ 226.002904] Call Trace:
> [ 226.005645] [<ffffffff811c7f81>] ? alloc_pages_current+0x91/0x100
> [ 226.012557] [<ffffffff811d27c3>] ? deactivate_slab+0x383/0x400
> [ 226.019173] [<ffffffff811d3957>] new_slab+0xa7/0x460
> [ 226.024826] [<ffffffff81678c75>] __slab_alloc+0x310/0x470
> [ 226.030960] [<ffffffff8130caf6>] ? get_from_free_list+0x46/0x60
> [ 226.037679] [<ffffffff8108dd31>] ? alloc_worker+0x21/0x50
> [ 226.043812] [<ffffffff811d46c1>] kmem_cache_alloc_node_trace+0x91/0x250
> [ 226.051299] [<ffffffff8108dd31>] alloc_worker+0x21/0x50
> [ 226.057236] [<ffffffff8108ff23>] create_worker+0x53/0x1e0
> [ 226.063357] [<ffffffff81092092>] alloc_unbound_pwq+0x2a2/0x510
> [ 226.069974] [<ffffffff810924b4>] wq_update_unbound_numa+0x1b4/0x220
> [ 226.077076] [<ffffffff81092828>] workqueue_cpu_up_callback+0x308/0x3d0
> [ 226.084468] [<ffffffff8109784e>] notifier_call_chain+0x4e/0x80
> [ 226.091084] [<ffffffff8109796e>] __raw_notifier_call_chain+0xe/0x10
> [ 226.098189] [<ffffffff810774f3>] cpu_notify+0x23/0x50
> [ 226.103929] [<ffffffff81077878>] _cpu_up+0x188/0x1a0
> [ 226.109574] [<ffffffff81077919>] cpu_up+0x89/0xb0
> [ 226.114923] [<ffffffff8166fba0>] cpu_subsys_online+0x40/0x90
> [ 226.121350] [<ffffffff814386dd>] device_online+0x6d/0xa0
> [ 226.127382] [<ffffffff814387a5>] online_store+0x95/0xa0
> [ 226.133322] [<ffffffff814358a8>] dev_attr_store+0x18/0x30
> [ 226.139457] [<ffffffff8126d76d>] sysfs_kf_write+0x3d/0x50
> [ 226.145586] [<ffffffff8126cc1a>] kernfs_fop_write+0x12a/0x180
> [ 226.152109] [<ffffffff811f1bb7>] vfs_write+0xb7/0x1f0
> [ 226.157853] [<ffffffff810232bc>] ? do_audit_syscall_entry+0x6c/0x70
> [ 226.164954] [<ffffffff811f2835>] SyS_write+0x55/0xd0
> [ 226.170595] [<ffffffff81681f09>] system_call_fastpath+0x12/0x17
> [ 226.177306] Code: 30 97 00 89 45 bc 83 e1 0f b8 22 01 32 01 01 c9 d3 f8 83 e0 03 89 9d 6c ff ff ff 83 e3 10 89 45 c0 0f 85 6d 01 00 00 48 8b 45 88 <48> 83 78 08 00 0f 84 51 01 00 00 b8 01
> [ 226.199175] RIP [<ffffffff81182597>] __alloc_pages_nodemask+0xb7/0x940
> [ 226.206576] RSP <ffff880446123918>
> [ 226.210471] CR2: 0000000000001b08
> [ 226.227939] ---[ end trace 30d753e1e1124696 ]---
> [ 226.412591] Kernel panic - not syncing: Fatal exception
> [ 226.430948] Kernel Offset: disabled
> [ 226.434845] drm_kms_helper: panic occurred, switching back to text console
> [ 226.618325] ---[ end Kernel panic - not syncing: Fatal exception
> [ 226.625047] ------------[ cut here ]------------
> [ 226.630213] WARNING: CPU: 43 PID: 5400 at arch/x86/kernel/smp.c:124 native_smp_send_reschedule+0x5d/0x60()
> [ 226.640999] Modules linked in: xt_CHECKSUM ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 ipt_REJECT nf_reject_ipv4 nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntracd
> [ 226.730275] CPU: 43 PID: 5400 Comm: bash Tainted: G D 4.0.0-rc4-bug-fixed-remove #16
> [ 226.740189] Hardware name: Insyde Brickland/Type2 - Board Product Name1, BIOS Brickland.05.04.15.0024 02/28/2015
> [ 226.751558] 0000000000000000 00000000aa535e80 ffff88046db63d58 ffffffff8167aa08
> [ 226.759865] 0000000000000000 0000000000000000 ffff88046db63d98 ffffffff810772da
> [ 226.768173] ffff88046db63d98 0000000000000000 ffff88046d615380 000000000000002b
> [ 226.776480] Call Trace:
> [ 226.779212] <IRQ> [<ffffffff8167aa08>] dump_stack+0x45/0x57
> [ 226.785657] [<ffffffff810772da>] warn_slowpath_common+0x8a/0xc0
> [ 226.792367] [<ffffffff8107740a>] warn_slowpath_null+0x1a/0x20
> [ 226.798886] [<ffffffff8104a64d>] native_smp_send_reschedule+0x5d/0x60
> [ 226.806182] [<ffffffff810b4fe5>] trigger_load_balance+0x145/0x1b0
> [ 226.813093] [<ffffffff810a348c>] scheduler_tick+0x9c/0xe0
> [ 226.819228] [<ffffffff810e0a21>] update_process_times+0x51/0x60
> [ 226.825946] [<ffffffff810f0925>] tick_sched_handle.isra.18+0x25/0x60
> [ 226.833143] [<ffffffff810f09a4>] tick_sched_timer+0x44/0x80
> [ 226.839467] [<ffffffff810e1737>] __run_hrtimer+0x77/0x1d0
> [ 226.845590] [<ffffffff810f0960>] ? tick_sched_handle.isra.18+0x60/0x60
> [ 226.852980] [<ffffffff810e1b13>] hrtimer_interrupt+0x103/0x230
> [ 226.859596] [<ffffffff8104d3d9>] local_apic_timer_interrupt+0x39/0x60
> [ 226.866883] [<ffffffff81684d85>] smp_apic_timer_interrupt+0x45/0x60
> [ 226.873982] [<ffffffff81682ded>] apic_timer_interrupt+0x6d/0x80
> [ 226.880690] <EOI> [<ffffffff81675abe>] ? panic+0x1c3/0x204
> [ 226.887036] [<ffffffff81675ab7>] ? panic+0x1bc/0x204
> [ 226.892682] [<ffffffff81018949>] oops_end+0x109/0x120
> [ 226.898422] [<ffffffff81675285>] no_context+0x2ee/0x366
> [ 226.904359] [<ffffffff81675370>] __bad_area_nosemaphore+0x73/0x1cc
> [ 226.911361] [<ffffffff816756ae>] bad_area+0x44/0x4c
> [ 226.916910] [<ffffffff81062b1a>] __do_page_fault+0x2ea/0x420
> [ 226.923331] [<ffffffff81062c81>] do_page_fault+0x31/0x70
> [ 226.929364] [<ffffffff81683f08>] page_fault+0x28/0x30
> [ 226.935106] [<ffffffff8108dd31>] ? alloc_worker+0x21/0x50
> [ 226.941235] [<ffffffff81182597>] ? __alloc_pages_nodemask+0xb7/0x940
> [ 226.948430] [<ffffffff81182705>] ? __alloc_pages_nodemask+0x225/0x940
> [ 226.955725] [<ffffffff811c7f81>] ? alloc_pages_current+0x91/0x100
> [ 226.962624] [<ffffffff811d27c3>] ? deactivate_slab+0x383/0x400
> [ 226.969239] [<ffffffff811d3957>] new_slab+0xa7/0x460
> [ 226.974885] [<ffffffff81678c75>] __slab_alloc+0x310/0x470
> [ 226.981015] [<ffffffff8130caf6>] ? get_from_free_list+0x46/0x60
> [ 226.987727] [<ffffffff8108dd31>] ? alloc_worker+0x21/0x50
> [ 226.993851] [<ffffffff811d46c1>] kmem_cache_alloc_node_trace+0x91/0x250
> [ 227.001340] [<ffffffff8108dd31>] alloc_worker+0x21/0x50
> [ 227.007275] [<ffffffff8108ff23>] create_worker+0x53/0x1e0
> [ 227.013404] [<ffffffff81092092>] alloc_unbound_pwq+0x2a2/0x510
> [ 227.020019] [<ffffffff810924b4>] wq_update_unbound_numa+0x1b4/0x220
> [ 227.027112] [<ffffffff81092828>] workqueue_cpu_up_callback+0x308/0x3d0
> [ 227.034502] [<ffffffff8109784e>] notifier_call_chain+0x4e/0x80
> [ 227.041117] [<ffffffff8109796e>] __raw_notifier_call_chain+0xe/0x10
> [ 227.048219] [<ffffffff810774f3>] cpu_notify+0x23/0x50
> [ 227.053961] [<ffffffff81077878>] _cpu_up+0x188/0x1a0
> [ 227.059597] [<ffffffff81077919>] cpu_up+0x89/0xb0
> [ 227.064950] [<ffffffff8166fba0>] cpu_subsys_online+0x40/0x90
> [ 227.071372] [<ffffffff814386dd>] device_online+0x6d/0xa0
> [ 227.077395] [<ffffffff814387a5>] online_store+0x95/0xa0
> [ 227.083332] [<ffffffff814358a8>] dev_attr_store+0x18/0x30
> [ 227.089460] [<ffffffff8126d76d>] sysfs_kf_write+0x3d/0x50
> [ 227.095589] [<ffffffff8126cc1a>] kernfs_fop_write+0x12a/0x180
> [ 227.102108] [<ffffffff811f1bb7>] vfs_write+0xb7/0x1f0
> [ 227.107850] [<ffffffff810232bc>] ? do_audit_syscall_entry+0x6c/0x70
> [ 227.114950] [<ffffffff811f2835>] SyS_write+0x55/0xd0
> [ 227.120595] [<ffffffff81681f09>] system_call_fastpath+0x12/0x17
> [ 227.127306] ---[ end trace 30d753e1e1124697 ]---
>
> Signed-off-by: Song Xiumiao <songxiumiao@xxxxxxxxxx>
> Signed-off-by: Gong Zhaogang <gongzhaogang@xxxxxxxxxx>
> Tested-by: Liu Changsheng <liuchangsheng@xxxxxxxxxx>
> Reviewed-by: xiaofeng.yan <xiaofeng.yan@xxxxxxxxxx>
> Reviewed-by: Fan Dongdong <fandd@xxxxxxxxxx>
> ---
> kernel/workqueue.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index 586ad91..cae6277 100644
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -3253,7 +3253,8 @@ static struct worker_pool *get_unbound_pool(const struct workqueue_attrs *attrs)
>  	if (wq_numa_enabled) {
>  		for_each_node(node) {
>  			if (cpumask_subset(pool->attrs->cpumask,
> -					   wq_numa_possible_cpumask[node])) {
> +					   wq_numa_possible_cpumask[node]) &&
> +			    node_online(node)) {
>  				pool->node = node;
>  				break;
>  			}
> --
> 1.9.1
>
>
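For anyone following along, the node selection that the quoted hunk
changes can be modelled in userspace roughly as below. This is only a
sketch under stated assumptions: struct node_info, pick_pool_node(),
NO_NODE and the 64-bit bitmask are hypothetical stand-ins, not kernel
code. The kernel side uses cpumask_subset(), wq_numa_possible_cpumask[]
and node_online() as shown in the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NO_NODE (-1)	/* hypothetical "no node preference" marker */

/* Hypothetical per-node state for this sketch; not a kernel structure. */
struct node_info {
	uint64_t possible_cpus;	/* bit n set => CPU n may belong to the node */
	bool online;		/* false => do not allocate from this node */
};

/* True if every CPU in 'sub' is also in 'super' (models cpumask_subset()). */
static bool mask_subset(uint64_t sub, uint64_t super)
{
	return (sub & ~super) == 0;
}

/*
 * Return the first node that is online and whose possible CPUs cover
 * pool_cpus, or NO_NODE if no such node exists. The "online" test is
 * the condition the patch adds; without it an offline node could be
 * chosen and later used as the allocation target.
 */
static int pick_pool_node(const struct node_info *nodes, int nr_nodes,
			  uint64_t pool_cpus)
{
	assert(nodes != NULL);

	for (int node = 0; node < nr_nodes; node++) {
		if (mask_subset(pool_cpus, nodes[node].possible_cpus) &&
		    nodes[node].online)
			return node;
	}
	return NO_NODE;
}
```

In this model, a pool whose CPUs are covered by an offline node 0 and
by an online node 1 ends up on node 1 rather than node 0. Returning
NO_NODE when nothing matches is a simplification of this sketch.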

--
tejun