Re: [PATCH] usb: dwc3: Add dwc3 lock for blocking interrupt storming

From: Thinh Nguyen
Date: Thu Mar 10 2022 - 20:57:17 EST


정재훈 wrote:
> Hi.
>
>> -----Original Message-----
>> From: Thinh Nguyen [mailto:Thinh.Nguyen@xxxxxxxxxxxx]
>> Sent: Thursday, March 10, 2022 11:14 AM
>> To: JaeHun Jung; Felipe Balbi; Greg Kroah-Hartman
>> Cc: open list:USB XHCI DRIVER; open list; Seungchull Suh; Daehwan Jung
>> Subject: Re: [PATCH] usb: dwc3: Add dwc3 lock for blocking interrupt
>> storming
>>
>> Hi,
>>
>> JaeHun Jung wrote:
>>> An interrupt storm occurred, with a very low probability of occurrence.
>>> We estimate that the problem is caused by a race condition between the
>>> top half and bottom half of the interrupt service routine.
>>> We confirmed that the variables hold values that should not be possible
>>> when the ISR is entered through a normal H/W IRQ.
>>> =====================================================================
>>> (struct dwc3_event_buffer *) ev_buf = 0xFFFFFF88DE6A0380 (
>>> (void *) buf = 0xFFFFFFC01594E000,
>>> (void *) cache = 0xFFFFFF88DDC14080,
>>> (unsigned int) length = 4096,
>>> (unsigned int) lpos = 0,
>>> (unsigned int) count = 0, <<
>>> (unsigned int) flags = 1, <<
>>> =====================================================================
>>> "evt->count=0" and "evt->flags=DWC3_EVENT_PENDING" cannot be set at
>>> the same time.
>>>
>>> We estimate that a race condition occurred between dwc3_interrupt()
>>> and dwc3_process_event_buf() called by
>>> dwc3_gadget_process_pending_events().
>>> So I try to block the race condition through spin_lock.
>>
>> This looks like it needs a memory barrier. Would this work for you?
> Maybe it could. But "evt->count = 0;" is updated in dwc3_process_event_buf().
> So I think spin_lock is a clearer approach for this issue.
>

Not really. If the problem is that evt->flags is not updated in time,
then the solution should be a memory barrier. The spin_lock would
obscure the issue, and we should avoid taking a spin_lock in the
top-half.
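
To illustrate the ordering concern, here is a minimal userspace sketch
using C11 atomics. It is only an analogy, not the driver code: the
struct, the function names and EVENT_PENDING are hypothetical, and
release/acquire ordering stands in for the kernel barrier. The writer
stores count before publishing the flag, and the reader checks the flag
before trusting count.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for DWC3_EVENT_PENDING. */
#define EVENT_PENDING 1u

/* Simplified, hypothetical model of an event buffer. */
struct event_buf {
	unsigned int count;	/* plain data, published through flags */
	atomic_uint flags;
};

/* "Top half": record the event count, then publish the pending flag. */
bool top_half(struct event_buf *evt, unsigned int count)
{
	/* If the previous batch is still pending, do not touch count. */
	if (atomic_load_explicit(&evt->flags, memory_order_acquire) &
	    EVENT_PENDING)
		return false;

	evt->count = count;
	/* Release: the count store is visible before the flag is seen. */
	atomic_store_explicit(&evt->flags, EVENT_PENDING,
			      memory_order_release);
	return true;
}

/* "Bottom half": consume the events, then clear the pending flag. */
unsigned int bottom_half(struct event_buf *evt)
{
	unsigned int handled;

	if (!(atomic_load_explicit(&evt->flags, memory_order_acquire) &
	      EVENT_PENDING))
		return 0;

	handled = evt->count;
	evt->count = 0;
	/* Release: count is reset before the flag is observed as cleared. */
	atomic_store_explicit(&evt->flags, 0, memory_order_release);
	return handled;
}
```

With this ordering, an observer that sees the flag cleared also sees
count == 0, and one that sees the flag set sees the count written
before it, without taking a lock in the top-half path.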

BR,
Thinh

>>
>> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
>> index c02e239978e0..a96c344b9f17 100644
>> --- a/drivers/usb/dwc3/gadget.c
>> +++ b/drivers/usb/dwc3/gadget.c
>> @@ -5340,6 +5340,9 @@ static irqreturn_t dwc3_check_event_buf(struct dwc3_event_buffer *evt)
>>  		return IRQ_HANDLED;
>>  	}
>>
>> +	/* Make sure the event flags is updated */
>> +	wmb();
>> +
>>  	/*
>>  	 * With PCIe legacy interrupt, test shows that top-half irq handler can
>>  	 * be called again after HW interrupt deassertion. Check if bottom-half
>>
>>
>> Thanks,
>> Thinh
>