Re: [PATCH] usb: dwc3: Potential fix of possible dwc3 interrupt storm

From: Thinh Nguyen
Date: Fri Sep 06 2024 - 20:55:28 EST


On Sat, Sep 07, 2024, Selvarasu Ganesan wrote:
>
> Hi Thinh,
>
> I ran the code you recommended on our testing environment and was able
> to reproduce the issue one time.
>
> For the case where evt->flags contains DWC3_EVENT_PENDING, I've included
> the following debugging information. I added this debug message at the
> start of the dwc3_event_buffers_cleanup and dwc3_event_buffers_setup
> functions, which run during suspend and resume.
>
> The results were quite interesting. I'm curious to understand how
> evt->flags is set to DWC3_EVENT_PENDING while DWC3_GEVNTSIZ is
> equal to 0x1000 during the suspend.

That is indeed strange.

> It means that the previous bottom-half handler prior to suspend might
> still be executing in the middle of the process.
>
> Could you please give your suggestions here? And let me know if there is
> anything you want me to test or if additional details are required.
>
>
> ##DBG: dwc3_event_buffers_cleanup:
>  evt->length    :0x1000
>  evt->lpos      :0x20c
>  evt->count     :0x0
>  evt->flags     :0x1 // This is unexpected if DWC3_GEVNTSIZ(0)(0xc408) is
> 0x00001000. It means that the previous bottom-half handler may still be
> running

Perhaps.

But I doubt that's the case. The bottom-half shouldn't take so long to
complete that the flag is still set at the next resume.
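
For reference, here is a rough user-space model of the handshake being
discussed, as I understand it: the top-half sets DWC3_EVENT_PENDING and
masks the interrupt in GEVNTSIZ, and the bottom-half unmasks the interrupt
and clears the flag last. This is only a sketch and not the driver code.
The struct and function names (model_evt, model_hw, model_top_half,
model_bottom_half) are made up for illustration, and the bit positions are
assumed to match drivers/usb/dwc3/core.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed to mirror DWC3_EVENT_PENDING and DWC3_GEVNTSIZ_INTMASK. */
#define EVENT_PENDING     (1u << 0)
#define GEVNTSIZ_INTMASK  (1u << 31)
#define GEVNTCOUNT_MASK   0xfffcu
#define EVENT_SIZE        4u

/* Hypothetical stand-ins for the software event buffer and the registers. */
struct model_evt { uint32_t length, lpos, count, flags; };
struct model_hw  { uint32_t gevntsiz, gevntcount; };

/* Top-half model: returns true when the bottom-half should be scheduled. */
bool model_top_half(struct model_evt *evt, struct model_hw *hw)
{
	uint32_t count;

	/* A batch is still marked as in flight, so take nothing new. */
	if (evt->flags & EVENT_PENDING)
		return false;

	count = hw->gevntcount & GEVNTCOUNT_MASK;
	if (!count)
		return false;

	evt->count = count;
	evt->flags |= EVENT_PENDING;

	/* Mask the interrupt, then acknowledge the bytes just claimed. */
	hw->gevntsiz = GEVNTSIZ_INTMASK | (evt->length & 0xffffu);
	hw->gevntcount -= count;
	return true;
}

/* Bottom-half model: consume events, unmask, and clear the flag last. */
void model_bottom_half(struct model_evt *evt, struct model_hw *hw)
{
	if (!(evt->flags & EVENT_PENDING))
		return;

	while (evt->count >= EVENT_SIZE) {
		evt->lpos = (evt->lpos + EVENT_SIZE) % evt->length;
		evt->count -= EVENT_SIZE;
	}
	evt->count = 0;

	hw->gevntsiz = evt->length & 0xffffu;	/* unmask the interrupt */
	evt->flags &= ~EVENT_PENDING;
}
```

In this model, a flag left at 0x1 while GEVNTSIZ is already unmasked makes
every later top-half call return early without acknowledging any events.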

>
>  DWC3_GEVNTSIZ(0)(0xc408)       : 0x00001000
>  DWC3_GEVNTCOUNT(0)(0xc40c)     : 0x00000000
>  DWC3_DCFG(0xc700)              : 0x00e008a8
>  DWC3_DCTL(0xc704)              : 0x0cf00a00
>  DWC3_DEVTEN(0xc708)            : 0x00000000
>  DWC3_DSTS(0xc70c)              : 0x00d20cd1
>

The controller status is halted. So there's no problem with
soft-disconnect. For the interrupt mask in GEVNTSIZ to be cleared,
the bottom-half had most likely completed.
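
As a quick way to cross-check the dumps, the register values can be decoded
with a few helpers like the ones below. This is a minimal sketch and the
helper names are hypothetical. The bit positions (INTMASK at bit 31 of
GEVNTSIZ, Run/Stop at bit 31 of DCTL, DevCtrlHlt at bit 22 of DSTS) are
what I believe drivers/usb/dwc3/core.h defines, so please verify them
against your tree.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed to mirror DWC3_GEVNTSIZ_INTMASK, DWC3_DCTL_RUN_STOP and
 * DWC3_DSTS_DEVCTRLHLT from the driver header. */
#define GEVNTSIZ_INTMASK  (1u << 31)
#define DCTL_RUN_STOP     (1u << 31)
#define DSTS_DEVCTRLHLT   (1u << 22)

/* True when the event interrupt is masked in GEVNTSIZ. */
bool gevntsiz_irq_masked(uint32_t gevntsiz)
{
	return (gevntsiz & GEVNTSIZ_INTMASK) != 0;
}

/* Event buffer size field (low 16 bits of GEVNTSIZ). */
uint32_t gevntsiz_size(uint32_t gevntsiz)
{
	return gevntsiz & 0xffffu;
}

/* True when the Run/Stop bit is set in DCTL. */
bool dctl_running(uint32_t dctl)
{
	return (dctl & DCTL_RUN_STOP) != 0;
}

/* True when DSTS reports the device controller as halted. */
bool dsts_halted(uint32_t dsts)
{
	return (dsts & DSTS_DEVCTRLHLT) != 0;
}
```

With the values from the cleanup dump, gevntsiz_irq_masked(0x00001000) is
false, dctl_running(0x0cf00a00) is false, and dsts_halted(0x00d20cd1) is
true.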

>
> ##DBG: dwc3_event_buffers_setup:
>  evt->length    :0x1000
>  evt->lpos      :0x20c

The fact that evt->lpos did not get updated tells me that there's
something wrong with memory access on your platform during suspend and
resume.

>  evt->count     :0x0
>  evt->flags     :0x1 // Still not cleared during resume.
>
>  DWC3_GEVNTSIZ(0)(0xc408)       : 0x00000000
>  DWC3_GEVNTCOUNT(0)(0xc40c)     : 0x00000000
>  DWC3_DCFG(0xc700)              : 0x00080800
>  DWC3_DCTL(0xc704)              : 0x00f00000
>  DWC3_DEVTEN(0xc708)            : 0x00000000
>  DWC3_DSTS(0xc70c)              : 0x00d20001
>

Please help look into your platform to see what condition triggers this
memory access issue. If this is a hardware quirk, we can update the
change accordingly and note it as such.

Thanks,
Thinh

(If possible, for future tests, please dump the dwc3 tracepoints. Many
thanks for the tests.)