Re: [PATCH] usb: gadget: composite: split spinlock to avoid recursion

From: Peter Chen
Date: Fri Nov 15 2019 - 01:59:16 EST


On 19-11-14 14:23:30, Michael Olbrich wrote:
> On Wed, Nov 13, 2019 at 10:36:25AM -0500, Alan Stern wrote:
> > On Wed, 13 Nov 2019, Peter Chen wrote:
> > > On 19-11-12 10:33:18, Michael Olbrich wrote:
> > > > 'delayed_status' and 'deactivations' are used completely independent but
> > > > they share the same spinlock. This can result in spinlock recursion:
> > > >
> > > > BUG: spinlock recursion on CPU#1, uvc-gadget/322
> > > > lock: 0xffffffc0570364e0, .magic: dead4ead, .owner: uvc-gadget/322, .owner_cpu: 1
> > > > CPU: 1 PID: 322 Comm: uvc-gadget Tainted: G C O 5.3.0-20190916-1+ #55
> > > > Hardware name: XXXXX (DT)
> > > > Call trace:
> > > > dump_backtrace+0x0/0x178
> > > > show_stack+0x24/0x30
> > > > dump_stack+0xc0/0x104
> > > > spin_dump+0x90/0xa0
> > > > do_raw_spin_lock+0xd8/0x108
> > > > _raw_spin_lock_irqsave+0x40/0x50
> > > > composite_disconnect+0x2c/0x80
> > > > usb_gadget_disconnect+0x84/0x150
> > > > usb_gadget_deactivate+0x64/0x120
> > > > usb_function_deactivate+0x70/0x80
> > > > uvc_function_disconnect+0x20/0x58
> > > > uvc_v4l2_release+0x34/0x90
> > > > v4l2_release+0xbc/0xf0
> > > > __fput+0xb0/0x218
> > > > ____fput+0x20/0x30
> > > > task_work_run+0xa0/0xd0
> > > > do_notify_resume+0x2f4/0x340
> > > > work_pending+0x8/0x14
> > > >
> > > > Fix this by using separate spinlocks.
> > >
> > > This issue may be introduced by 0a55187a1ec8c ("USB: gadget core: Issue
> > > ->disconnect() callback from usb_gadget_disconnect()"), which adds
> > > gadget's disconnect at usb_gadget_disconnect. Add Alan, if he is Ok
> > > with your patch, you may cc to stable tree.
> >
> > I wasn't aware of the dual usage of that lock in the composite core
> > (and 0a55187a1ec8c touches only the gadget core, not composite.c).
> >
> > In any case, I don't have a good feel for how the locking is supposed
> > to work in the composite core. This is really something Felipe should
> > look at.
> >
> > Would a better fix be to change usb_function_deactivate() so that it
> > doesn't hold the lock while calling usb_gadget_deactivate()? Maybe
> > increment cdev->deactivations unconditionally before dropping the lock
> > (for mutual exclusion) and then decrement it again if the call fails?
>
> Hmm, I think, that would mean that usb_gadget_activate() may be called
> while usb_gadget_deactivate() is still running right? That's not
> acceptable, is it?
>

I agree with Alan. Return error for usb_function_activate if
usb_function_deactive is running could be accepted, it seems there is
no other good way to fix this lock recursion and your below atomic
issue, let's see what Felipe will say.

Peter

> Anyways. Something else is needed because executing usb_gadget_deactivate()
> under the spinlock has another problem. It's hard to reproduce, but we've
> seen this one:
>
> BUG: scheduling while atomic: pipewire/260/0x00000002
> Modules linked in: allegro(C) regmap_mmio v4l2_mem2mem xlnx_vcu st1232 uio_pdrv_genirq
> Preemption disabled at: [<ffffff801061dc40>] usb_function_deactivate+0x30/0x80
> CPU: 1 PID: 260 Comm: pipewire Tainted: G C O 5.3.0-20191112-1 #2
> Hardware name: Wolfvision ZynqMP PF4 (DT)
> Call trace:
> dump_backtrace+0x0/0x178
> show_stack+0x24/0x30
> dump_stack+0xc0/0x104
> __schedule_bug+0xb0/0xc0
> __schedule+0x354/0x4d8
> schedule+0x44/0xd8
> schedule_timeout+0x1b4/0x380
> wait_for_common+0xc0/0x188
> wait_for_completion_timeout+0x2c/0x38
> dwc3_gadget_pullup+0x90/0xb0
> usb_gadget_disconnect+0x38/0x150
> usb_gadget_deactivate+0x64/0x120
> usb_function_deactivate+0x70/0x80
> uvc_function_disconnect+0x20/0x58
> uvc_v4l2_release+0x34/0x90
> v4l2_release+0xbc/0xf0
> __fput+0x90/0x208
> ____fput+0x20/0x30
> task_work_run+0x98/0xb8
> do_notify_resume+0x2f4/0x340
> work_pending+0x8/0x14
> dwc3 fe200000.usb: timed out waiting for SETUP phase
>

Hi Michael,


--

Thanks,
Peter Chen