Re: [PATCH] usb: dwc3: gadget: fix miss isoc issue introduced by IRQ latency

From: Thinh Nguyen
Date: Mon Dec 17 2018 - 15:12:35 EST


Hi Zengtao,

On 12/16/2018 5:45 PM, Zengtao (B) wrote:
>>>>>> If it's a busy system, some times when we start an isoc transfer,
>>>>>> the framenumber get from the event buffer may be already elasped,
>>>>>> in this case, we will get all the packets dropped due to miss isoc.
>>>>>> And we turn into transfer nothing, to fix this issue, we need to
>>>>>> fix the framenumber to make sure that it's not out of date.
>>>>>>
>>>>>> Signed-off-by: Liang Shengjun <liangshengjun@xxxxxxxxxxxxx>
>>>>>> Signed-off-by: Zeng Tao <prime.zeng@xxxxxxxxxxxxx>
>>>>> There's a patch going upstream already doing this:
>>>>>
>>>>>
>> https://urldefense.proofpoint.com/v2/url?u=https-3A__git.kernel.org_pub_scm_linux_kernel_git_balbi_usb.git_commit&d=DwIGaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=u9FYoxKtyhjrGFcyixFYqTjw1ZX0VsG2d8FCmzkTY-w&m=yGrN_JnamczN1G0wY4Th67vAfxSJEouaUgJdu0u6Av8&s=v4xeWOLyCSNoHxHS6kI8_bJfHgddFsPkqzgLFTJZOG8&e=
>>>>> /?h
>>>>> =next&id=d53701067f048b8b11635e964b6d3bd9a248c622
>>>>>
>>>> Sorry, I think I missed to update to the latest version. But I think
>>>> my patch is more efficient. Because I just sync the frame from the
>>>> HW, it won't fail and there is no need to extra tries.
>>>>
>>>> What do you think about it?
>>> the 14 bits you use for your check is not representative of the actual
>>> micro-frame you're currently in. Thinh explained that in the
>>> discussion we had until we came to the patch I pointed you to above.
>>> Please look at the mailing list archives for details.
>>>
>> There are several issues with this patch.
>> 1) Your frame elapsed time check is not based on interval but statically
>> DWC3_EVENT_PRAM_SOFFN / 2. That's about 1 second. So it doesn't
>> account for isoc transfers with large service interval of 1 sec or more.
>
> This is just for checking whether the Frame number is overflow or not. So
> 1 second should a reason value.

Why when there's another way to solve this issue without creating a new
logic that only works within 1 second limit.

>
>> 2) This function __dwc3_gadget_target_frame_elapsed() should have
>> comments for what it does. The name implies that this function checks
>> for eframe > cframe, and not eframe > cframe + 1s.
> eframe > cframe + 1s is used to deal with the Frame number overflow.

Right. You need to provide comments for that. It's not obvious that
DWC3_EVENT_PRAM_SOFFN / 2 is around 1 second.

>
>> 3) If this check fails, then it will do DWC3_ALIGN_FRAME() at least twice.
>> The isoc transfer will start 1 more interval into the future than it needs to.
>>
> If the interval is one, 1 more interval should be more reasonable because the
> core always fetch the trb one frame ahead.

Did it fail with your setup when you test with interval of 1? It didn't
fail from my testing setup. Your check applies for all intervals and not
just interval of 1. Also, the databook doesn't mention this limitation
that you have to do 2 intervals into the future if the service interval
is 1.

Regardless, if the __dwc3_gadget_target_frame_elapsed() passes for
service interval of 1, then you'd only do DWC3_ALIGN_FRAME() once. This
is different than what you said/wanted to do.

>
>> Also, I think the role of this check should be from the controller as it has
>> more information and its own logic to decide if the selected future uframe
>> has elapsed.
> Yes, agree, but I think if the sw can be used to do the same thing and more
> effective, Why not?
>

No. The SW cannot do the same thing. As Felipe mentioned previously,
__dwc3_gadget_get_frame() can only get a 14-bit frame number. The
controller keeps track of the frame number with at least 16 bits. It has
more information and can work with larger service intervals.

Can you test with Felipe's patches to see if they work for you? We can
come back and review if there are still issues.

Thanks,
Thinh