Re: [workqueue] d5bff968ea: WARNING:at_kernel/workqueue.c:#process_one_work

From: Xing Zhengjun
Date: Thu Jan 21 2021 - 20:50:44 EST

Next message: Sedat Dilek: "Re: [PATCH v6] pgo: add clang's Profile Guided Optimization infrastructure"
Previous message: Sedat Dilek: "Re: [PATCH v5] pgo: add clang's Profile Guided Optimization infrastructure"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On 1/21/2021 12:00 PM, Hillf Danton wrote:

On Wed, 20 Jan 2021 21:46:33 +0800 Oliver Sang wrote:

On Fri, Jan 15, 2021 at 03:24:32PM +0800, Hillf Danton wrote:

Thu, 14 Jan 2021 15:45:11 +0800

FYI, we noticed the following commit (built with gcc-9):

commit: d5bff968ea9cc005e632d9369c26cbd8148c93d5 ("workqueue: break affinity initiatively")
https://git.kernel.org/cgit/linux/kernel/git/paulmck/linux-rcu.git dev.2021.01.11b

[...]

[ 73.794288] WARNING: CPU: 0 PID: 22 at kernel/workqueue.c:2192 process_one_work

Thanks for your report.

We can also break CPU affinity by checking POOL_DISASSOCIATED at attach
time without extra cost paid; that way we have the same behavior as at
the unbind time.

What is more, the change that makes kworkers pcpu is dropped, because it
is not going to help either hotplug or the stop-machine mechanism.

Hi, with the patch below applied, the issue still happens.

Thanks for your report.

[ 4.574467] pci 0000:00:00.0: Limiting direct PCI/PCI transfers
[ 4.575651] pci 0000:00:01.0: Activating ISA DMA hang workarounds
[ 4.576900] pci 0000:00:02.0: Video device with shadowed ROM at [mem 0x000c0000-0x000dffff]
[ 4.578648] PCI: CLS 0 bytes, default 64
[ 4.579685] Unpacking initramfs...
[ 8.878031] ------------[ cut here ]------------
[ 8.879083] WARNING: CPU: 0 PID: 22 at kernel/workqueue.c:2187 process_one_work+0x92/0x9e0
[ 8.880688] Modules linked in:
[ 8.881274] CPU: 0 PID: 22 Comm: kworker/1:0 Not tainted 5.11.0-rc3-gc213503139bb #2

The kworker bound to CPU1 runs on CPU0 and triggers the warning, which
shows that the scheduler breaks CPU affinity after 06249738a41a
("workqueue: Manually break affinity on hotplug"), though quite likely
only for initial workers such as kworker/1:0.
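
The mismatch described above can be read straight off the "CPU: 0 PID: 22 Comm: kworker/1:0" line. The following is a minimal userspace sketch, not kernel code: it assumes the usual "kworker/<cpu>:<id>" naming for per-CPU workers (unbound workers are named "kworker/u<pool>:<id>"), and the helper names kworker_home_cpu() and kworker_off_cpu() are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: return the CPU a per-CPU kworker is named after,
 * given its comm string ("kworker/<cpu>:<id>"). Returns -1 for unbound
 * workers ("kworker/u...") or any string that does not match.
 */
int kworker_home_cpu(const char *comm)
{
	int cpu, id;

	assert(comm != NULL);
	if (sscanf(comm, "kworker/%d:%d", &cpu, &id) != 2)
		return -1;
	return cpu < 0 ? -1 : cpu;
}

/* Nonzero when a per-CPU kworker is running away from its home CPU. */
int kworker_off_cpu(const char *comm, int running_cpu)
{
	int home = kworker_home_cpu(comm);

	return home >= 0 && home != running_cpu;
}
```

With the values from the log, kworker_off_cpu("kworker/1:0", 0) is nonzero, which is the situation the warning reports.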

[ 8.882518] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
[ 8.887539] Workqueue: 0x0 (events)
[ 8.887838] EIP: process_one_work+0x92/0x9e0
[ 8.887838] Code: 37 64 a1 58 54 4c 43 39 45 24 74 2c 31 c9 ba 01 00 00 00 c7 04 24 01 00 00 00 b8 08 1d f5 42 e8 74 85 13 00 ff 05 b8 30 04 43 <0f> 0b ba 01 00 00 00 eb 22 8d 74 26 00 90 c7 04 24 01 00 00 00 31
[ 8.887838] EAX: 42f51d08 EBX: 00000000 ECX: 00000000 EDX: 00000001
[ 8.887838] ESI: 43c04720 EDI: 42e45620 EBP: de7f23c0 ESP: 43d7bf08
[ 8.887838] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 EFLAGS: 00010002
[ 8.887838] CR0: 80050033 CR2: 00000000 CR3: 034e3000 CR4: 000406d0
[ 8.887838] Call Trace:
[ 8.887838] ? worker_thread+0x98/0x6a0
[ 8.887838] ? worker_thread+0x2dd/0x6a0
[ 8.887838] ? kthread+0x1ba/0x1e0
[ 8.887838] ? create_worker+0x1e0/0x1e0
[ 8.887838] ? kzalloc+0x20/0x20
[ 8.887838] ? ret_from_fork+0x1c/0x28
[ 8.887838] _warn_unseeded_randomness: 63 callbacks suppressed
[ 8.887838] random: get_random_bytes called from init_oops_id+0x2b/0x60 with crng_init=0
[ 8.887838] ---[ end trace ac461b4d54c37cfa ]---

Instead of creating the initial workers only on the active CPUs, rebind
them (labeled pcpu) and let them jump to the right CPU at boot time.

--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -2385,6 +2385,16 @@ woke_up:
 		return 0;
 	}
 
+	if (!(pool->flags & POOL_DISASSOCIATED) && smp_processor_id() !=
+	    pool->cpu) {
+		/* scheduler breaks CPU affinity for us, rebind it */
+		raw_spin_unlock_irq(&pool->lock);
+		set_cpus_allowed_ptr(worker->task, pool->attrs->cpumask);
+		/* and jump to the right seat */
+		schedule_timeout_interruptible(1);
+		goto woke_up;
+	}
+
 	worker_leave_idle(worker);
 recheck:
 	/* no more worker necessary? */
--
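
For readers following the logic of the hunk, here is a small userspace model of the condition it tests. It is a sketch under stated assumptions, not the kernel implementation: struct pool_model, on_wrong_cpu() and wakeups_until_home() are hypothetical names, the flag value is illustrative, and the sequence of CPUs seen at each wakeup is supplied by the caller rather than by a scheduler.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative stand-in for the kernel flag; the value is an assumption. */
#define POOL_DISASSOCIATED (1u << 2)

/* Only the two fields the check reads are modelled here. */
struct pool_model {
	int cpu;            /* CPU the pool is associated with */
	unsigned int flags; /* POOL_* flags */
};

/* The condition the patch tests: pool still associated, worker elsewhere. */
bool on_wrong_cpu(const struct pool_model *pool, int this_cpu)
{
	assert(pool != NULL);
	return !(pool->flags & POOL_DISASSOCIATED) && this_cpu != pool->cpu;
}

/*
 * Model of the "rebind, sleep, goto woke_up" loop: cpu_at_wakeup[i] is the
 * CPU the worker finds itself on at its i-th wakeup. Returns how many
 * rebind attempts happened before the check passed, or -1 if it never
 * passed within n wakeups.
 */
int wakeups_until_home(const struct pool_model *pool,
		       const int *cpu_at_wakeup, int n)
{
	for (int i = 0; i < n; i++)
		if (!on_wrong_cpu(pool, cpu_at_wakeup[i]))
			return i;
	return -1;
}
```

Note that the model only covers the wakeup path. A worker that is migrated after passing this check and before it starts processing a work item would not be caught by it, which may be one reason the warning can still fire.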

I tested the patch; the warning still appears in the kernel log.

[ 230.356503] smpboot: CPU 1 is now offline
[ 230.544652] x86: Booting SMP configuration:
[ 230.545077] smpboot: Booting Node 0 Processor 1 APIC 0x1
[ 230.545640] kvm-clock: cpu 1, msr 34f6021, secondary cpu clock
[ 230.545675] masked ExtINT on CPU#1
[ 230.593829] ------------[ cut here ]------------
[ 230.594257] WARNING: CPU: 0 PID: 257 at kernel/workqueue.c:2192 process_one_work+0x92/0x9e0
[ 230.594990] Modules linked in: rcutorture torture mousedev input_leds led_class pcspkr psmouse evbug tiny_power_button button
[ 230.595961] CPU: 0 PID: 257 Comm: kworker/1:3 Not tainted 5.11.0-rc3-gdcba55d9080f #2
[ 230.596621] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
[ 230.597322] Workqueue: 0x0 (rcu_gp)
[ 230.597636] EIP: process_one_work+0x92/0x9e0
[ 230.598005] Code: 37 64 a1 58 54 4c 43 39 45 24 74 2c 31 c9 ba 01 00 00 00 c7 04 24 01 00 00 00 b8 08 1d f5 42 e8 f4 85 13 00 ff 05 cc 30 04 43 <0f> 0b ba 01 00 00 00 eb 22 8d 74 26 00 90 c7 04 24 01 00 00 00 31
[ 230.599569] EAX: 42f51d08 EBX: 00000000 ECX: 00000000 EDX: 00000001
[ 230.600100] ESI: 43d94240 EDI: df4040f4 EBP: de7f23c0 ESP: bf5f1f08
[ 230.600629] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 EFLAGS: 00010002
[ 230.601203] CR0: 80050033 CR2: 01bdecbc CR3: 04e2c000 CR4: 000406d0
[ 230.601735] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[ 230.602265] DR6: fffe0ff0 DR7: 00000400
[ 230.602594] Call Trace:
[ 230.602813] ? process_one_work+0x20e/0x9e0
[ 230.603181] ? worker_thread+0x32d/0x700
[ 230.603522] ? kthread+0x1ba/0x1e0
[ 230.603818] ? create_worker+0x1e0/0x1e0
[ 230.604157] ? kzalloc+0x20/0x20
[ 230.604524] ? ret_from_fork+0x1c/0x28
[ 230.604850] ---[ end trace 06b1e66b5e17fa85 ]---
[ 230.605504] kvm-guest: stealtime: cpu 1, msr 9e7e6ec0
[ 230.766960] smpboot: CPU 1 is now offline
[ 230.814803] x86: Booting SMP configuration:
[ 230.815306] smpboot: Booting Node 0 Processor 1 APIC 0x1
[ 230.815964] kvm-clock: cpu 1, msr 34f6021, secondary cpu clock

--
Zhengjun Xing

Attachment: dmesg.xz
Description: Binary data
