Re: [PATCH] usb: cdns3: Fix: ARM core hang after connect/disconnect operation.

From: Greg KH
Date: Wed Jan 08 2020 - 09:28:34 EST


On Wed, Jan 08, 2020 at 12:37:18PM +0100, Pawel Laszczak wrote:
> The ARM core hang when access USB register after tens of thousands
> connect/disconnect operation.
>
> The issue was observed on platform with android system and is not easy
> to reproduce. During test controller works at HS device mode with host
> connected.
>
> The test is based on continuous disabling/enabling USB device function
> what cause continuous setting DEVDS/DEVEN bit in USB_CONF register.
>
> For testing was used composite device consisting from ADP and RNDIS
> function.
>
> Presumably the problem was caused by DMA transfer made after setting
> DEVDS bit. To resolve this issue fix stops all DMA transfer before
> setting DEVDS bit.
>
> Signed-off-by: Pawel Laszczak <pawell@xxxxxxxxxxx>
> Signed-off-by: Peter Chan <peter.chan@xxxxxxx>
> Reported-by: Peter Chan <peter.chan@xxxxxxx>
> Fixes: 7733f6c32e36 ("usb: cdns3: Add Cadence USB3 DRD Driver")

As this is in the 5.4 kernel release, you should have a "Cc: stable..."
line in here, right?

> ---
> drivers/usb/cdns3/gadget.c | 84 ++++++++++++++++++++++++++------------
> drivers/usb/cdns3/gadget.h | 1 +
> 2 files changed, 58 insertions(+), 27 deletions(-)
>
> diff --git a/drivers/usb/cdns3/gadget.c b/drivers/usb/cdns3/gadget.c
> index 4c1e75509303..277ed8484032 100644
> --- a/drivers/usb/cdns3/gadget.c
> +++ b/drivers/usb/cdns3/gadget.c
> @@ -1516,6 +1516,49 @@ static int cdns3_ep_onchip_buffer_reserve(struct cdns3_device *priv_dev,
> return 0;
> }
>
> +static int cdns3_disable_reset_ep(struct cdns3_device *priv_dev,
> + struct cdns3_endpoint *priv_ep)
> +{
> + unsigned long flags;
> + u32 val;
> + int ret;
> +
> + spin_lock_irqsave(&priv_dev->lock, flags);
> +
> + if (priv_ep->flags & EP_HW_RESETED) {

Is "reseted" a word? :)

> + spin_unlock_irqrestore(&priv_dev->lock, flags);
> + return 0;
> + }
> +
> + cdns3_select_ep(priv_dev, priv_ep->endpoint.desc->bEndpointAddress);
> +
> + val = readl(&priv_dev->regs->ep_cfg);
> + val &= ~EP_CFG_ENABLE;
> + writel(val, &priv_dev->regs->ep_cfg);
> +
> + /**

No need for kernel-doc comment style please.

> + * Driver needs some time before resetting endpoint.
> + * It need waits for clearing DBUSY bit or for timeout expired.
> + * 10us is enough time for controller to stop transfer.
> + */
> + readl_poll_timeout_atomic(&priv_dev->regs->ep_sts, val,
> + !(val & EP_STS_DBUSY), 1, 10);

You don't care if you time out?

> + writel(EP_CMD_EPRST, &priv_dev->regs->ep_cmd);
> +
> + ret = readl_poll_timeout_atomic(&priv_dev->regs->ep_cmd, val,
> + !(val & (EP_CMD_CSTALL | EP_CMD_EPRST)),
> + 1, 1000);
> +
> + if (unlikely(ret))

Unless you can measure the difference of using/not using a
unlikely/likely mark, NEVER use it. The compiler and cpu can almost
always do better than you can, we have the tests to prove it.

> + dev_err(priv_dev->dev, "Timeout: %s resetting failed.\n",
> + priv_ep->name);
> +
> + priv_ep->flags |= EP_HW_RESETED;

So an error happens, but you still claim the device is reset? What can
a user do about this error?

Yes, I know you copied this from other code in this driver, but it
doesn't look right to me.

> + spin_unlock_irqrestore(&priv_dev->lock, flags);

Why print while a spinlock is held? That's mean, you could have printed
here, right?

> +
> + return ret;
> +}
> +
> void cdns3_configure_dmult(struct cdns3_device *priv_dev,
> struct cdns3_endpoint *priv_ep)
> {
> @@ -1893,8 +1936,6 @@ static int cdns3_gadget_ep_disable(struct usb_ep *ep)
> struct usb_request *request;
> unsigned long flags;
> int ret = 0;
> - u32 ep_cfg;
> - int val;
>
> if (!ep) {
> pr_err("usbss: invalid parameters\n");
> @@ -1908,32 +1949,11 @@ static int cdns3_gadget_ep_disable(struct usb_ep *ep)
> "%s is already disabled\n", priv_ep->name))
> return 0;
>
> - spin_lock_irqsave(&priv_dev->lock, flags);
> -
> trace_cdns3_gadget_ep_disable(priv_ep);
>
> - cdns3_select_ep(priv_dev, ep->desc->bEndpointAddress);
> -
> - ep_cfg = readl(&priv_dev->regs->ep_cfg);
> - ep_cfg &= ~EP_CFG_ENABLE;
> - writel(ep_cfg, &priv_dev->regs->ep_cfg);
> -
> - /**
> - * Driver needs some time before resetting endpoint.
> - * It need waits for clearing DBUSY bit or for timeout expired.
> - * 10us is enough time for controller to stop transfer.
> - */
> - readl_poll_timeout_atomic(&priv_dev->regs->ep_sts, val,
> - !(val & EP_STS_DBUSY), 1, 10);
> - writel(EP_CMD_EPRST, &priv_dev->regs->ep_cmd);
> -
> - readl_poll_timeout_atomic(&priv_dev->regs->ep_cmd, val,
> - !(val & (EP_CMD_CSTALL | EP_CMD_EPRST)),
> - 1, 1000);
> - if (unlikely(ret))
> - dev_err(priv_dev->dev, "Timeout: %s resetting failed.\n",
> - priv_ep->name);
> + cdns3_disable_reset_ep(priv_dev, priv_ep);
>
> + spin_lock_irqsave(&priv_dev->lock, flags);
> while (!list_empty(&priv_ep->pending_req_list)) {
> request = cdns3_next_request(&priv_ep->pending_req_list);
>
> @@ -1962,6 +1982,7 @@ static int cdns3_gadget_ep_disable(struct usb_ep *ep)
>
> ep->desc = NULL;
> priv_ep->flags &= ~EP_ENABLED;
> + priv_ep->flags |= EP_CLAIMED | EP_HW_RESETED;

Why do you now set EP_CLAIMED here when you never set that before? Is
this a different type of change?

thanks,

greg k-h