回复: [kthread] a90477f0c9: WARNING:at_kernel/kthread.c:#kthread_queue_work

From: Zhang, Qiang
Date: Mon Jul 06 2020 - 09:45:59 EST




________________________________________
发件人: Zhang, Qiang <Qiang.Zhang@xxxxxxxxxxxxx>
发送时间: 2020年7月6日 18:17
收件人: Petr Mladek
抄送: Andrew Morton; linux-kernel@xxxxxxxxxxxxxxx
主题: 回复: [kthread] a90477f0c9: WARNING:at_kernel/kthread.c:#kthread_queue_work

Hi , Petr Mladek
There some question for "Work could not be queued when worker being destroyed" patch,

when in "spi_init_queue" func :
"kthread_init_worker(&ctlr->kworker); (worker->task = NULL)
ctlr->kworker_task = kthread_run(kthread_worker_fn, &ctlr->kworker,
"%s", dev_name(&ctlr->dev)); "

in "spi_start_queue" func:
"kthread_queue_work(&ctlr->kworker, &ctlr->pump_messages);"

Becasue the kthread_worker_fn is not begin running, if queue work to worker,
the "!worker->task" = true, trigger WARN.

Are Need to " test_bit(KTHREAD_SHOULD_STOP, &to_kthread(current)->flags) " replace "WARN_ON(!worker->task)" in queuing_blocked func
or
"kthread_create_worker" replace "kthread_run(kthread_worker_fn.." in spi_init_queue func
(because in kthread_create_worker, "worker->task" will be assigned a value) ?

Zhang Qiang




________________________________________
发件人: kernel test robot <rong.a.chen@xxxxxxxxx>
发送时间: 2020年7月6日 17:38
收件人: Zhang, Qiang
抄送: lkp@xxxxxxxxxxxx
主题: [kthread] a90477f0c9: WARNING:at_kernel/kthread.c:#kthread_queue_work

Greeting,

FYI, we noticed the following commit (built with gcc-7):

commit: a90477f0c956621eb0dd69f0abfb6066ad8fbef7 ("kthread: work could not be queued when worker being destroyed")
https://github.com/hnaz/linux-mm master

in testcase: trinity
with following parameters:

runtime: 300s

test-description: Trinity is a linux system call fuzz tester.
test-url: http://codemonkey.org.uk/projects/trinity/


on test machine: qemu-system-i386 -enable-kvm -cpu SandyBridge -smp 2 -m 16G

caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):


+-------------------------------------------------+------------+------------+
| | 85c6127e06 | a90477f0c9 |
+-------------------------------------------------+------------+------------+
| boot_successes | 6 | 0 |
| boot_failures | 0 | 30 |
| WARNING:at_kernel/kthread.c:#kthread_queue_work | 0 | 30 |
| EIP:kthread_queue_work | 0 | 30 |
| BUG:kernel_hang_in_test_stage | 0 | 2 |
+-------------------------------------------------+------------+------------+


If you fix the issue, kindly add following tag
Reported-by: kernel test robot <rong.a.chen@xxxxxxxxx>


[ 5.554282] WARNING: CPU: 0 PID: 1 at kernel/kthread.c:817 kthread_queue_work+0xf8/0x120
[ 5.556204] Modules linked in:
[ 5.556204] CPU: 0 PID: 1 Comm: swapper Tainted: G S 5.8.0-rc3-00014-ga90477f0c9566 #1
[ 5.556204] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
[ 5.556204] EIP: kthread_queue_work+0xf8/0x120
[ 5.556204] Code: 00 59 e9 67 ff ff ff 8d 76 00 8b 4e 10 85 c9 75 a6 8d 4b 28 89 f2 89 d8 bf 01 00 00 00 e8 f0 f5 ff ff eb 93 8d b6 00 00 00 00 <0f> 0b 6a 00 31 c9 ba 01 00 00 00 b8 08 6c 64 c3 e8 f3 04 0b 00 5b
[ 5.556204] EAX: 00000000 EBX: eeff538c ECX: 00000000 EDX: 00000001
[ 5.556204] ESI: eeff53d0 EDI: 00000000 EBP: f5edde70 ESP: f5edde5c
[ 5.556204] DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068 EFLAGS: 00010046
[ 5.556204] CR0: 80050033 CR2: b7eda844 CR3: 038ae000 CR4: 000406d0
[ 5.556204] Call Trace:
[ 5.556204] spi_start_queue+0x50/0x70
[ 5.556204] spi_register_controller+0x642/0xa80
[ 5.556204] ? kobject_get+0x54/0xf0
[ 5.556204] ? parport_pc_platform_probe+0x10/0x10
[ 5.556204] spi_bitbang_start+0x2f/0x70
[ 5.556204] ? parport_pc_platform_probe+0x10/0x10
[ 5.556204] butterfly_attach+0x164/0x2c0
[ 5.556204] ? driver_detach+0x30/0x30
[ 5.556204] port_check+0x1c/0x30
[ 5.556204] bus_for_each_dev+0x5a/0x90
[ 5.556204] __parport_register_driver+0x76/0xa0
[ 5.556204] ? driver_detach+0x30/0x30
[ 5.556204] ? spi_engine_driver_init+0x16/0x16
[ 5.556204] butterfly_init+0x19/0x1b
[ 5.556204] do_one_initcall+0x79/0x310
[ 5.556204] ? parse_args+0x70/0x420
[ 5.556204] ? rcu_read_lock_sched_held+0x2f/0x50
[ 5.556204] ? trace_initcall_level+0x95/0xc7
[ 5.556204] ? kernel_init_freeable+0x129/0x19f
[ 5.556204] kernel_init_freeable+0x148/0x19f
[ 5.556204] ? rest_init+0x100/0x100
[ 5.556204] kernel_init+0xd/0xf0
[ 5.556204] ret_from_fork+0x1c/0x28
[ 5.556204] irq event stamp: 8620410
[ 5.556204] hardirqs last enabled at (8620409): [<c2675daa>] _raw_spin_unlock_irqrestore+0x2a/0x50
[ 5.556204] hardirqs last disabled at (8620410): [<c2675bb4>] _raw_spin_lock_irqsave+0x14/0x40
[ 5.556204] softirqs last enabled at (8620262): [<c2677900>] __do_softirq+0x2e0/0x480
[ 5.556204] softirqs last disabled at (8620249): [<c1023cf5>] call_on_stack+0x45/0x50
[ 5.556204] ---[ end trace 184061a442b5fd6f ]---


To reproduce:

# build kernel
cd linux
cp config-5.8.0-rc3-00014-ga90477f0c9566 .config
make HOSTCC=gcc-7 CC=gcc-7 ARCH=i386 olddefconfig prepare modules_prepare bzImage

git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp qemu -k <bzImage> job-script # job-script is attached in this email



Thanks,
Rong Chen