Re: [PATCH] workqueue: Ensure that cpumask set for pools created after boot

From: Michael Bringmann
Date: Wed May 24 2017 - 19:40:14 EST

Next message: Tim Chen: "Re: [PATCH v2 for-4.12-fixes 1/2] sched/fair: Use task_groups instead of leaf_cfs_rq_list to walk all cfs_rqs"
Previous message: David Daney: "[PATCH] test_bpf: Add a couple of tests for BPF_JSGE."
In reply to: Michael Bringmann: "Re: [PATCH] workqueue: Ensure that cpumask set for pools created after boot"
Next in thread: Tejun Heo: "Re: [PATCH] workqueue: Ensure that cpumask set for pools created after boot"

On 05/23/2017 03:10 PM, Tejun Heo wrote:
> Hello,
>
> On Tue, May 23, 2017 at 03:09:07PM -0500, Michael Bringmann wrote:
>> To confirm, you want the WARN_ON(cpumask_any(pool->attrs->cpumask) >= NR_CPUS)
>> at the point where I place my current patch?
>
> Yeah, cpumask_weight() probably is a bit more intuitive but I'm
> curious why we're creating workqueues for a node before cpus come
> online.
>
> Thanks.
>

And here is the full output of the WARN_ONCE(!cpumask(pool->attrs->cpumask), "message"),
followed by the crash in kernel/sched/core.c. I also removed the patch, to
demonstrate what would happen in the 4.12 kernel on powerpc.
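
For anyone following along, the condition being warned on amounts to "the pool's cpumask has no bits set". The following is only a rough userspace sketch of that check, not the kernel code: the `model_*` names, the struct layout and `MODEL_NR_CPUS` are made up for illustration, and they only loosely mirror what cpumask_weight() and cpumask_any() do. It shows why either form of the test catches an empty mask.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Userspace model only; the kernel's struct cpumask and helpers differ. */
#define MODEL_NR_CPUS 2048 /* assumption: same value as the NR_CPUS in the oops */
#define MODEL_BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define MODEL_WORDS ((MODEL_NR_CPUS + MODEL_BITS_PER_WORD - 1) / MODEL_BITS_PER_WORD)

struct model_cpumask {
    unsigned long bits[MODEL_WORDS];
};

/* Clear every bit, i.e. produce the empty mask that triggers the warning. */
void model_clear(struct model_cpumask *m)
{
    memset(m->bits, 0, sizeof(m->bits));
}

/* Mark one CPU as present in the mask. */
void model_set_cpu(struct model_cpumask *m, unsigned int cpu)
{
    assert(cpu < MODEL_NR_CPUS);
    m->bits[cpu / MODEL_BITS_PER_WORD] |= 1UL << (cpu % MODEL_BITS_PER_WORD);
}

/* Count the set bits (loosely analogous to cpumask_weight()). */
unsigned int model_weight(const struct model_cpumask *m)
{
    unsigned int n = 0;

    for (size_t i = 0; i < MODEL_WORDS; i++)
        for (unsigned long w = m->bits[i]; w; w &= w - 1)
            n++;
    return n;
}

/* Return the first set CPU, or MODEL_NR_CPUS if the mask is empty
 * (loosely analogous to cpumask_any() returning an out-of-range value). */
unsigned int model_any(const struct model_cpumask *m)
{
    for (unsigned int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
        if (m->bits[cpu / MODEL_BITS_PER_WORD] &
            (1UL << (cpu % MODEL_BITS_PER_WORD)))
            return cpu;
    return MODEL_NR_CPUS;
}

/* The condition the warning tests for: no CPU at all in the pool's mask. */
bool model_pool_mask_is_empty(const struct model_cpumask *m)
{
    return model_weight(m) == 0;
}
```

In this model, a cleared mask makes model_pool_mask_is_empty() return true and model_any() return MODEL_NR_CPUS, so the weight test and the "any >= NR_CPUS" test agree; the weight form just reads more directly.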

1) Boot with Shared CPUs (2 CPUs, 8VPs) / Shared Memory (20G)
2) numactl -H
3) Hot-add 16 VPs to system, and run 'numactl -H'
4) Hot-remove 16 VPs from system. Hit WARN_ONCE message in
workqueue.c:get_unbound_pool(), followed by the crash in
kernel/sched/core.c, as I also removed the patch that 'fixed'
the cpumask.

-----------------------Beginning of Log-------------------------------------

Red Hat Enterprise Linux Server 7.3 (Maipo)
Kernel 4.12.0-rc1.wi91275_debug_03.ppc64le+ on an ppc64le

ltcalpine2-lp20 login: root
Password:
Last login: Wed May 24 18:45:40 from oc1554177480.austin.ibm.com
[root@ltcalpine2-lp20 ~]# numactl -H
available: 2 nodes (0,6)
node 0 cpus:
node 0 size: 0 MB
node 0 free: 0 MB
node 6 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63
node 6 size: 19858 MB
node 6 free: 16920 MB
node distances:
node 0 6
0: 10 40
6: 40 10
[root@ltcalpine2-lp20 ~]# numactl -H
available: 2 nodes (0,6)
node 0 cpus:
node 0 size: 0 MB
node 0 free: 0 MB
node 6 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191
node 6 size: 19858 MB
node 6 free: 16362 MB
node distances:
node 0 6
0: 10 40
6: 40 10
[root@ltcalpine2-lp20 ~]# [ 321.310943] workqueue:get_unbound_pool has empty cpumask for pool attrs
[ 321.310961] ------------[ cut here ]------------
[ 321.310997] WARNING: CPU: 184 PID: 13201 at kernel/workqueue.c:3375 alloc_unbound_pwq+0x5c0/0x5e0
[ 321.311005] Modules linked in: rpadlpar_io rpaphp dccp_diag dccp tcp_diag udp_diag inet_diag unix_diag af_packet_diag netlink_diag sg pseries_rng ghash_generic gf128mul xts vmx_crypto binfmt_misc ip_tables xfs libcrc32c sd_mod ibmvscsi ibmveth scsi_transport_srp dm_mirror dm_region_hash dm_log dm_mod
[ 321.311097] CPU: 184 PID: 13201 Comm: cpuhp/184 Not tainted 4.12.0-rc1.wi91275_debug_03.ppc64le+ #8
[ 321.311106] task: c000000408961080 task.stack: c000000406394000
[ 321.311113] NIP: c000000000116c80 LR: c000000000116c7c CTR: 0000000000000000
[ 321.311121] REGS: c0000004063977b0 TRAP: 0700 Not tainted (4.12.0-rc1.wi91275_debug_03.ppc64le+)
[ 321.311128] MSR: 8000000000029033 <SF,EE,ME,IR,DR,RI,LE>
[ 321.311150] CR: 28000082 XER: 00000000
[ 321.311159] CFAR: c000000000a2dc80 SOFTE: 1
[ 321.311159] GPR00: c000000000116c7c c000000406397a30 c0000000013ae900 000000000000003b
[ 321.311159] GPR04: c000000408961a38 0000000000000006 00000000a49e41e5 ffffffffa4a5a483
[ 321.311159] GPR08: 00000000000062cc 0000000000000000 0000000000000000 c000000408961a38
[ 321.311159] GPR12: 0000000000000000 c00000000fb38c00 c00000000011e858 c00000040a902ac0
[ 321.311159] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 321.311159] GPR20: c000000406394000 0000000000000002 c000000406394000 0000000000000000
[ 321.311159] GPR24: c000000405075400 c000000404fc0000 0000000000000110 c0000000015a4c88
[ 321.311159] GPR28: 0000000000000000 c0000004fe256000 c0000004fe256008 c0000004fe052800
[ 321.311290] NIP [c000000000116c80] alloc_unbound_pwq+0x5c0/0x5e0
[ 321.311298] LR [c000000000116c7c] alloc_unbound_pwq+0x5bc/0x5e0
[ 321.311305] Call Trace:
[ 321.311310] [c000000406397a30] [c000000000116c7c] alloc_unbound_pwq+0x5bc/0x5e0 (unreliable)
[ 321.311323] [c000000406397ad0] [c000000000116e30] wq_update_unbound_numa+0x190/0x270
[ 321.311334] [c000000406397b60] [c000000000118eb0] workqueue_offline_cpu+0xe0/0x130
[ 321.311345] [c000000406397bf0] [c0000000000e9f20] cpuhp_invoke_callback+0x240/0xcd0
[ 321.311355] [c000000406397cb0] [c0000000000eab28] cpuhp_down_callbacks+0x78/0xf0
[ 321.311365] [c000000406397d00] [c0000000000eae6c] cpuhp_thread_fun+0x18c/0x1a0
[ 321.311376] [c000000406397d30] [c0000000001251cc] smpboot_thread_fn+0x2fc/0x3b0
[ 321.311386] [c000000406397dc0] [c00000000011e9c0] kthread+0x170/0x1b0
[ 321.311397] [c000000406397e30] [c00000000000b4f4] ret_from_kernel_thread+0x5c/0x68
[ 321.311406] Instruction dump:
[ 321.311413] 3d42fff0 892ac565 2f890000 40fefd98 39200001 3c62ff89 3c82ff6c 3863d590
[ 321.311437] 38847cb0 992ac565 48916fc9 60000000 <0fe00000> 4bfffd70 60000000 60420000
[ 321.311462] ---[ end trace 9f7c1cd616b26de8 ]---
[ 321.318347] Unable to handle kernel paging request for unaligned access at address 0xc0000003c5577ebf
[ 321.318448] Faulting instruction address: 0xc00000000055ec8c
[ 321.318457] Oops: Kernel access of bad area, sig: 7 [#1]
[ 321.318462] SMP NR_CPUS=2048
[ 321.318463] NUMA
[ 321.318468] pSeries
[ 321.318473] Modules linked in: rpadlpar_io rpaphp dccp_diag dccp tcp_diag udp_diag inet_diag unix_diag af_packet_diag netlink_diag sg pseries_rng ghash_generic gf128mul xts vmx_crypto binfmt_misc ip_tables xfs libcrc32c sd_mod ibmvscsi ibmveth scsi_transport_srp dm_mirror dm_region_hash dm_log dm_mod
[ 321.318524] CPU: 184 PID: 13201 Comm: cpuhp/184 Tainted: G W 4.12.0-rc1.wi91275_debug_03.ppc64le+ #8
[ 321.318532] task: c000000408961080 task.stack: c000000406394000
[ 321.318537] NIP: c00000000055ec8c LR: c0000000001312d4 CTR: c000000000145d50
[ 321.318544] REGS: c000000406397690 TRAP: 0600 Tainted: G W (4.12.0-rc1.wi91275_debug_03.ppc64le+)
[ 321.318551] MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE>
[ 321.318563] CR: 28000024 XER: 00000000
[ 321.318571] CFAR: c0000000001312d0 DAR: c0000003c5577ebf DSISR: 00000000 SOFTE: 0
[ 321.318571] GPR00: c000000000131298 c000000406397910 c0000000013ae900 c0000004b6d22820
[ 321.318571] GPR04: c0000004b6d22820 c0000003c5577ebf 0000000000000000 00000004f1230000
[ 321.318571] GPR08: 0000000d8ddb1ea7 0000000000000000 0000000000000008 c000000408961aa8
[ 321.318571] GPR12: c000000000145d50 c00000000fb38c00 c00000000011e858 c00000040a902ac0
[ 321.318571] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 321.318571] GPR20: c000000406394000 0000000000000002 0000000000004000 c000000000fb7700
[ 321.318571] GPR24: c0000000013f5d00 c0000000013f9d48 0000000000000000 c0000004b6d230e8
[ 321.318571] GPR28: 0000000000000004 00000003c45bfc57 0000000000000800 c0000004b6d22800
[ 321.318664] NIP [c00000000055ec8c] llist_add_batch+0xc/0x40
[ 321.318670] LR [c0000000001312d4] try_to_wake_up+0x524/0x850
[ 321.318675] Call Trace:
[ 321.318679] [c000000406397910] [c000000000131298] try_to_wake_up+0x4e8/0x850 (unreliable)
[ 321.318689] [c000000406397990] [c000000000111bf8] create_worker+0x148/0x220
[ 321.318696] [c000000406397a30] [c000000000116ae8] alloc_unbound_pwq+0x428/0x5e0
[ 321.318705] [c000000406397ad0] [c000000000116e30] wq_update_unbound_numa+0x190/0x270
[ 321.318713] [c000000406397b60] [c000000000118eb0] workqueue_offline_cpu+0xe0/0x130
[ 321.318721] [c000000406397bf0] [c0000000000e9f20] cpuhp_invoke_callback+0x240/0xcd0
[ 321.318729] [c000000406397cb0] [c0000000000eab28] cpuhp_down_callbacks+0x78/0xf0
[ 321.318737] [c000000406397d00] [c0000000000eae6c] cpuhp_thread_fun+0x18c/0x1a0
[ 321.318745] [c000000406397d30] [c0000000001251cc] smpboot_thread_fn+0x2fc/0x3b0
[ 321.318754] [c000000406397dc0] [c00000000011e9c0] kthread+0x170/0x1b0
[ 321.318762] [c000000406397e30] [c00000000000b4f4] ret_from_kernel_thread+0x5c/0x68
[ 321.318769] Instruction dump:
[ 321.318775] 60420000 38600000 4e800020 60000000 60420000 7c832378 4e800020 60000000
[ 321.318790] 60000000 e9250000 f9240000 7c0004ac <7d4028a8> 7c2a4800 40c20010 7c6029ad
[ 321.318808] ---[ end trace 9f7c1cd616b26de9 ]---
[ 321.322303]
[ 323.322505] Kernel panic - not syncing: Fatal exception
[ 323.429027] Rebooting in 10 seconds..

--
Michael W. Bringmann
Linux Technology Center
IBM Corporation
Tie-Line 363-5196
External: (512) 286-5196
Cell: (512) 466-0650
mwb@xxxxxxxxxxxxxxxxxx
