Re: INFO: task hung in hub_port_init

From: Shuah Khan
Date: Wed Sep 22 2021 - 14:15:19 EST


Hi Hao Sun,

On 9/20/21 8:31 AM, Shuah Khan wrote:
On 9/18/21 7:53 AM, Alan Stern wrote:
On Sat, Sep 18, 2021 at 10:17:26AM +0800, Hao Sun wrote:
Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> 于2021年9月18日周六 上午10:02写道:

On Sat, Sep 18, 2021 at 09:56:52AM +0800, Hao Sun wrote:
Hi Alan,

Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> 于2021年9月13日周一 下午9:55写道:

On Mon, Sep 13, 2021 at 11:13:15AM +0800, Hao Sun wrote:
Hello,

When using Healer to fuzz the Linux kernel, the following crash was triggered.

HEAD commit: ac08b1c68d1b Merge tag 'pci-v5.15-changes'
git tree: upstream
console output:
https://drive.google.com/file/d/1ZeDIMe-DoY3fB32j2p5ifgpq-Lc5N74I/view?usp=sharing
kernel config: https://drive.google.com/file/d/1qrJUXD8ZIeAkg-xojzDpp04v9MtQ8RR6/view?usp=sharing
Syzlang reproducer:
https://drive.google.com/file/d/1tZe8VmXfxoPqlNpzpGOd-e5WCSWgbkxB/view?usp=sharing
Similar report:
https://groups.google.com/g/syzkaller-bugs/c/zX55CUzjBOY/m/uf91r0XqAgAJ

Sorry, I don't have a C reproducer for this crash but have a Syzlang
reproducer. Also, hope the symbolized report can help.
Here are the instructions on how to execute Syzlang prog:
https://github.com/google/syzkaller/blob/master/docs/executing_syzkaller_programs.md

If you fix this issue, please add the following tag to the commit:
Reported-by: Hao Sun <sunhao.th@xxxxxxxxx>

There's not much hope of finding the cause of a problem like this
without seeing the kernel log.


Healer found another Syzlang prog to reproduce this task hang:
https://paste.ubuntu.com/p/HCNYbKJYtx/

Also here is a very simple script to execute the reproducer:
https://paste.ubuntu.com/p/ZTGmvFSP6d/

The `syz-execprog` and `syz-executor` are needed, so please build
Syzkaller first before running the script.
Hope this can help to find the root cause of the problem.

I don't have time to install and figure out how to use Healer and
Syzkaller.  But if you run the reproducer and post the kernel log,
I'll take a look at it.


Just executed the reproducer, here is the full log:
https://paste.ubuntu.com/p/x43SqQy8PX/

The log indicates that the problem is related to the vhci-hcd driver
somehow.  I don't know why those "Module has invalid ELF structures"
errors keep appearing, starting in line 1946 of the log.


Can you send me your config? This message is rather odd.

[ 82.249631][ T6679] Module has invalid ELF structures

It is right below:
[ 82.248529][ T6679] vhci_hcd vhci_hcd.0: Device attached

or

[ 83.860819][ T6710] vhci_hcd vhci_hcd.0: port 0 already used

My guess is this isn't the vhci_hcd module that gets loaded at this
point when we see this message, but another module that gets loaded
when vhci_hcd initiates probe after device attach. Note that vhci_hcd
is loaded earlier.

It is possible, the hung task might be related to load_module()
failure. Unfortunately load_module() doesn't print elf_validity_check()
error.

Would you be able to add this patch and run the reproducer again?


--------------------------------------------------------------------
diff --git a/kernel/module.c b/kernel/module.c
index 40ec9a030eec..02f758b04f05 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -3941,7 +3941,8 @@ static int load_module(struct load_info *info, const char __user *uargs,
*/
err = elf_validity_check(info);
if (err) {
- pr_err("Module has invalid ELF structures\n");
+ pr_err("Module has invalid ELF structures error (%ld)\n",
+ err);
goto free_copy;
}

--------------------------------------------------------------------

thanks,
-- Shuah