RE: [PATCH v7 01/10] usb: gadget: udc: Add timer support for usb requests

From: Anurag Kumar Vulisha
Date: Wed Dec 05 2018 - 10:43:56 EST


Hi Alan,

>-----Original Message-----
>From: Alan Stern [mailto:stern@xxxxxxxxxxxxxxxxxxx]
>Sent: Wednesday, December 05, 2018 12:59 AM
>To: Anurag Kumar Vulisha <anuragku@xxxxxxxxxx>
>Cc: Felipe Balbi <balbi@xxxxxxxxxx>; Greg Kroah-Hartman
><gregkh@xxxxxxxxxxxxxxxxxxx>; Shuah Khan <shuah@xxxxxxxxxx>; Johan Hovold
><johan@xxxxxxxxxx>; Jaejoong Kim <climbbb.kim@xxxxxxxxx>; Benjamin
>Herrenschmidt <benh@xxxxxxxxxxxxxxxxxxx>; Roger Quadros <rogerq@xxxxxx>; Manu
>Gautam <mgautam@xxxxxxxxxxxxxx>; martin.petersen@xxxxxxxxxx; Bart Van
>Assche <bvanassche@xxxxxxx>; Mike Christie <mchristi@xxxxxxxxxx>; Matthew
>Wilcox <willy@xxxxxxxxxxxxx>; Colin Ian King <colin.king@xxxxxxxxxxxxx>; linux-
>usb@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; v.anuragkumar@xxxxxxxxx;
>Thinh Nguyen <thinhn@xxxxxxxxxxxx>; Tejas Joglekar
><tejas.joglekar@xxxxxxxxxxxx>; Ajay Yugalkishore Pandey <APANDEY@xxxxxxxxxx>
>Subject: RE: [PATCH v7 01/10] usb: gadget: udc: Add timer support for usb requests
>
>On Tue, 4 Dec 2018, Anurag Kumar Vulisha wrote:
>
>> >Is there any way for the device controller driver to detect the problem without
>relying
>> >on a timeout?
>> >
>> >In fact, is this problem a known bug in the existing device controller hardware?
>Have
>> >the controller manufacturers published a suggested scheme for working around it?
>> >
>>
>> Yes, this problem is mentioned in dwc3 controller data book and the workaround
>> mentioned is the same that we are doing , to implement timeout and if no valid
>> stream event is found , after timeout issue end transfer followed by start transfer.
>
>Okay. But this implies that the problem is confined to DWC3 and
>doesn't affect other types of controllers, which would mean modifying
>the UDC core would be inappropriate.
>

Both host & gadget are equally responsible for this issue and it may potentially occur
with other controllers also(which are incapable of handling this situation) . Because of
this reason I would not call this issue as a dwc3 alone bug. One thing is that, with dwc3 the
issue is easily reproduced. Because of these reasons I feel that it would be better to have
a generic udc timers rather than having driver specific workaround. We had this same
discussion earlier in this mailing list https://lkml.org/lkml/2018/10/9/638

>Does the data book suggest a value for the timeout?
>

No, the databook doesn't mention about the timeout value

>> >At this point, it seems that the generic approach will be messier than having every
>> >controller driver implement its own fix. At least, that's how it appears to me.
>
>(Especially if dwc3 is the only driver affected.)
>

As discussed above, the issue may happen with other gadgets too. As I got divide opinions
on this implementation and both the implementations looks fine to me, I am little confused
on which should be implemented.

@Felipe: Do you agree with Alan's implementation? Please let us know your suggestion
on this.

>> With the dequeue approach there is an advantage(not related to this issue) that the
>> class driver can have control of the timed out transfer whether to requeue it or
>consider
>> it as an error (based on implementation). This approach gives more flexibility to the
>class
>> drivers. This may be out of context of this issue but wanted to point out that we
>may lose
>> this advantage on switching to older implementation.
>
>Class drivers can cancel and requeue requests at any time. There's no
>extra flexibility.
>

I agree with you, but the class drivers have to implement their own logic instead of
having a generic logic to implement the timeouts.

>> >Ideally it would not be necessary to rely on a timeout at all.
>> >
>> >Also, maintainers dislike module parameters. It would be better not to add one.
>>
>> Okay. I would be happy if any alternative for this issue is present but unfortunately
>> I am not able to figure out any alternative other than timers. If not
>module_params()
>> we can add an configfs entry in stream gadget to update the timeout. Please
>provide
>> your opinion on this approach.
>
>Since the purpose of the timeout is to detect a deadlock caused by a
>hardware bug, I suggest a fixed and relatively short timeout value such
>as one second. Cancelling and requeuing a few requests at 1-second
>intervals shouldn't add very much overhead.
>

Thanks for suggesting. 1 sec timeout should be fair enough. I will change this
in next series

Best Regards,
Anurag Kumar Vulisha