Re: [PATCH 1/1] Net: bcm.c: Remove Subtree Instead of Entry

From: David Hunter
Date: Fri Aug 09 2024 - 12:21:55 EST


Hello Oliver,

> What did you do to trigger the warning?

I am in the Linux Kernel Internship Program for the Linux Foundation. Our goal is to fix outstanding bugs in the kernel. I found the following bug on syzbot:

https://syzkaller.appspot.com/bug?extid=df49d48077305d17519a

This specific link is for a separate issue, for which I will soon send a separate patch; however, I found the bug addressed by this patch after I set the kernel command-line parameter panic_on_warn to 0.

If you wish to reproduce the error, you can follow these steps:
1) compile and install a kernel with the config file from the link
2) pass the kernel parameter panic_on_warn=0
3) build and run the C reproducer for the bug

As best I can tell, the C reproducer simply made a system call that resulted in the bcm-can directory entry being deleted. I am still wrapping my head around the code (I am new to kernel programming), but here is the full stack trace.

[ 156.449047][ T71] Call Trace:
[ 156.450067][ T71] <TASK>
[ 156.451076][ T71] ? show_regs+0x84/0x8b
[ 156.452490][ T71] ? __warn+0x150/0x29e
[ 156.453754][ T71] ? remove_proc_entry+0x335/0x385
[ 156.456485][ T71] ? report_bug+0x33d/0x431
[ 156.457994][ T71] ? remove_proc_entry+0x335/0x385
[ 156.459845][ T71] ? handle_bug+0x3d/0x66
[ 156.461230][ T71] ? exc_invalid_op+0x17/0x3e
[ 156.462672][ T71] ? asm_exc_invalid_op+0x1a/0x20
[ 156.464282][ T71] ? __warn_printk+0x26d/0x2aa
[ 156.465759][ T71] ? remove_proc_entry+0x335/0x385
[ 156.467233][ T71] ? remove_proc_entry+0x334/0x385
[ 156.468821][ T71] ? proc_readdir+0x11a/0x11a
[ 156.470122][ T71] ? __sanitizer_cov_trace_pc+0x1e/0x42
[ 156.471697][ T71] ? cgw_remove_all_jobs+0xa5/0x16f
[ 156.474096][ T71] canbcm_pernet_exit+0x73/0x79
[ 156.476732][ T71] ops_exit_list+0xf1/0x146
[ 156.478358][ T71] cleanup_net+0x333/0x570
[ 156.479856][ T71] ? setup_net+0x7ba/0x7ba
[ 156.481479][ T71] ? process_scheduled_works+0x652/0xbab
[ 156.483592][ T71] process_scheduled_works+0x7b8/0xbab
[ 156.486039][ T71] ? drain_workqueue+0x33b/0x33b
[ 156.487841][ T71] ? __sanitizer_cov_trace_pc+0x1e/0x42
[ 156.489742][ T71] ? move_linked_works+0x9f/0x108
[ 156.491376][ T71] worker_thread+0x5bd/0x6cc
[ 156.492877][ T71] ? rescuer_thread+0x64d/0x64d
[ 156.494350][ T71] kthread+0x30a/0x31e
[ 156.495769][ T71] ? kthread_complete_and_exit+0x35/0x35
[ 156.497977][ T71] ret_from_fork+0x34/0x6b
[ 156.499734][ T71] ? kthread_complete_and_exit+0x35/0x35
[ 156.501494][ T71] ret_from_fork_asm+0x11/0x20
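
To make the trace above easier to reason about, here is a toy userspace model of the difference the patch subject refers to. It is only a sketch under my current understanding: remove_proc_entry() warns when asked to remove a directory that still has children, while remove_proc_subtree() removes the children first. None of this is kernel code; struct toy_entry and every toy_* function are made-up names for illustration.

```c
#include <stdio.h>
#include <stdlib.h>

/* Toy stand-in for a proc entry; NOT the kernel's struct proc_dir_entry. */
struct toy_entry {
    const char *name;
    struct toy_entry *parent;
    struct toy_entry *first_child;
    struct toy_entry *next_sibling;
};

/* Allocate an entry and link it at the head of its parent's child list. */
static struct toy_entry *toy_create(const char *name, struct toy_entry *parent)
{
    struct toy_entry *e = calloc(1, sizeof(*e));

    if (!e)
        return NULL;
    e->name = name;
    e->parent = parent;
    if (parent) {
        e->next_sibling = parent->first_child;
        parent->first_child = e;
    }
    return e;
}

/* Detach an entry from its parent's child list. */
static void toy_unlink(struct toy_entry *e)
{
    struct toy_entry **pp;

    if (!e->parent)
        return;
    for (pp = &e->parent->first_child; *pp; pp = &(*pp)->next_sibling) {
        if (*pp == e) {
            *pp = e->next_sibling;
            break;
        }
    }
    e->parent = NULL;
    e->next_sibling = NULL;
}

/*
 * Models removing a single entry: children are NOT removed, so a
 * non-empty directory triggers a warning and its children are leaked.
 * Returns the number of leaked children (0 means a clean removal).
 */
static int toy_remove_entry(struct toy_entry *e)
{
    struct toy_entry *c;
    int leaked = 0;

    for (c = e->first_child; c; c = c->next_sibling) {
        c->parent = NULL;   /* orphaned, intentionally leaked in this model */
        leaked++;
    }
    if (leaked)
        fprintf(stderr, "warning: removing non-empty directory '%s', leaking %d\n",
                e->name, leaked);
    toy_unlink(e);
    free(e);
    return leaked;
}

/*
 * Models removing a whole subtree: children go first, then the entry.
 * Returns the total number of entries removed.
 */
static int toy_remove_subtree(struct toy_entry *e)
{
    int removed = 1;

    while (e->first_child)
        removed += toy_remove_subtree(e->first_child);
    toy_unlink(e);
    free(e);
    return removed;
}
```

In this model, calling toy_remove_entry() on a directory that still holds one child prints the warning and returns 1, while toy_remove_subtree() on the same shape returns 2 and leaves nothing behind. Whether a leftover child is expected at that point in canbcm_pernet_exit() is exactly the root-cause question, and the model does not answer it.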

> Removing this warning probably does not heal the root cause of the issue.

I would love to work on the root cause of the issue if at all possible. Do you think the C reproducer went down an unlikely path, so that further work is not needed, or do you think this is an issue that requires some attention?

I appreciate the response to my patch. I am learning a lot.

Thanks,
David