Re: [syzbot] [usb?] INFO: task hung in hub_port_init (3)
From: Alan Stern
Date: Wed Nov 08 2023 - 11:12:07 EST
On Wed, Nov 08, 2023 at 04:25:45PM +0500, Muhammad Usama Anjum wrote:
> I've confirmed locally again that the logs belong to the same URB. This kworker
> gets stuck:
>
> [ 131.064283] usb_control_msg
> [ 131.065326] usb_internal_control_msg, urb: FFFF88814CC2AE00 urb->use_count: 0
> [ 131.066320] usb_start_wait_urb urb: FFFF88814CC2AE00 urb->use_count: 0
> [ 131.069988] usb_submit_urb urb: FFFF88814CC2AE00 urb->use_count: 0
> [ 131.070881] usb_hcd_submit_urb urb: FFFF88814CC2AE00 urb->use_count 1
> [ 131.072268] usb_submit_urb 0 urb: FFFF88814CC2AE00 urb->use_count: 1
> [ 131.073186] usb_start_wait_urb urb: FFFF88814CC2AE00 urb->use_count: 1
> [ 136.151750] usb_start_wait_urb wait_for_completion
> [ 136.153286] usb_kill_urb might_sleep
> [ 136.153859] vhci_hcd: vhci_urb_dequeue:875: vhci_urb_dequeue
> [ 136.154853] vhci_hcd: vhci_urb_dequeue:952: vhci_urb_dequeue return
> [ 136.155773] usb_kill_urb usb_hcd_unlink_urb use_count: 1
> [ 285.831355] INFO: task kworker/0:4:1586 blocked for more than 143 seconds.
Of course. It's waiting for the vhci_urb_dequeue() call to finish
unlinking the URB.
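A rough userspace sketch of why the task never wakes up: usb_kill_urb() asks the host controller driver to unlink the URB and then sleeps until urb->use_count drops to zero, which only happens once the URB has been given back. Everything named sketch_* below is made up for illustration, and the poll bound exists only so the model terminates; the real wait has no timeout.

```c
#include <stdbool.h>

/* Hypothetical stand-in for an URB; only the in-flight counter is modeled. */
struct sketch_urb {
	int use_count;
};

bool sketch_kill_returns(struct sketch_urb *urb, bool dequeue_gives_back,
			 int max_polls);

/*
 * Models the kill path: request the unlink, then wait for use_count == 0.
 * dequeue_gives_back is an assumption standing in for whether the driver's
 * dequeue callback eventually leads to the URB being given back.
 */
bool sketch_kill_returns(struct sketch_urb *urb, bool dequeue_gives_back,
			 int max_polls)
{
	int i;

	if (dequeue_gives_back)
		urb->use_count--;	/* giveback drops the in-flight count */

	for (i = 0; i < max_polls; i++) {
		if (urb->use_count == 0)
			return true;	/* waiter wakes up */
	}
	return false;	/* count never dropped: the waiter stays blocked */
}
```

With use_count stuck at 1 after the dequeue call, as in the log above, the model never returns true, which corresponds to the hung kworker.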
> > If you want to fix this problem (and probably a bunch of other ones in
> > syzbot's list of pending bugs), figure out what's wrong with the
> > ->urb_dequeue() callback routine in the usbip driver and fix it.
> I'm looking at it, haven't found anything yet.
I took a very quick look just now, and one thing stands out. If
vhci_urb_dequeue() is unable to allocate a vhci_unlink structure, it
calls usbip_event_add() and then returns without doing anything else.
But one of the things usbip_event_add() does is try to allocate a
usbip_event structure, and if that allocation fails then it returns
without doing anything. Now, if the memory allocation attempt in
vhci_urb_dequeue() fails then it seems quite likely that the attempt in
usbip_event_add() will also fail. Which means that nothing will happen
-- and that is a bug. URB-dequeue calls are not allowed to fail because
of memory pressure.
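To make the double-failure path concrete, here is a small userspace model. It is only a sketch of the control flow described above, not the usbip code: every model_* name, the allocation sizes, and the fail_mask fault-injection scheme are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */

static unsigned int fail_mask;   /* bit n set: the n-th allocation fails */
static unsigned int alloc_index; /* number of allocations attempted so far */

/* Allocator with fault injection driven by fail_mask. */
static void *model_alloc(size_t size)
{
	bool fail = (fail_mask >> alloc_index) & 1u;

	assert(size > 0);
	alloc_index++;
	return fail ? NULL : malloc(size);
}

struct model_urb {
	bool unlink_queued;	/* normal path: an unlink request was queued */
	bool event_queued;	/* fallback path: an error event was queued */
};

/* Models the event-add step: silently does nothing if allocation fails. */
static void model_event_add(struct model_urb *urb)
{
	void *ev = model_alloc(16);

	if (!ev)
		return;
	urb->event_queued = true;
	free(ev);
}

/* Models the dequeue step: falls back to the event path on failure. */
static void model_dequeue(struct model_urb *urb)
{
	void *unlink = model_alloc(32);

	if (!unlink) {
		model_event_add(urb);
		return;
	}
	urb->unlink_queued = true;
	free(unlink);
}

bool model_dequeue_makes_progress(unsigned int mask);

/* Runs one dequeue under the given failure mask; true if anything happened. */
bool model_dequeue_makes_progress(unsigned int mask)
{
	struct model_urb urb = { false, false };

	fail_mask = mask;
	alloc_index = 0;
	model_dequeue(&urb);
	return urb.unlink_queued || urb.event_queued;
}
```

Called with mask 0x0 or 0x1 the model reports progress; with 0x3 (both allocations fail) it reports none, which is the case where the URB would never be unlinked.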
Now, I don't know if this is the cause of the trouble in the syzbot
test. You should trace what's going on in vhci_urb_dequeue() to see
exactly what it does.
Alan Stern