Re: [PATCH] PCI: dwc: designware: don't sleep in atomic context

From: Bjorn Helgaas
Date: Mon Nov 06 2017 - 14:30:27 EST


On Fri, Oct 20, 2017 at 01:19:17PM -0500, Bjorn Helgaas wrote:
> On Fri, Oct 13, 2017 at 09:10:38AM +0530, Pankaj Dubey wrote:
> >
> >
> > On 10/12/2017 04:09 PM, David Laight wrote:
> > >From: Pankaj Dubey
> > >>Sent: 12 October 2017 08:55
> > >>In pcie-designware.c we call "usleep_range" in many places that run
> > >>in atomic context. This patch fixes these potential BUGs by
> > >>replacing "usleep_range" with mdelay calls.
> > >>
> > >>Signed-off-by: Pankaj Dubey <pankaj.dubey@xxxxxxxxxxx>
> > >>---
> > >> drivers/pci/dwc/pcie-designware.c | 8 ++++----
> > >> drivers/pci/dwc/pcie-designware.h | 3 +--
> > >> 2 files changed, 5 insertions(+), 6 deletions(-)
> > >>
> > >>diff --git a/drivers/pci/dwc/pcie-designware.c b/drivers/pci/dwc/pcie-designware.c
> > >>index 88abddd..35d19b9 100644
> > >>--- a/drivers/pci/dwc/pcie-designware.c
> > >>+++ b/drivers/pci/dwc/pcie-designware.c
> > >>@@ -138,7 +138,7 @@ static void dw_pcie_prog_outbound_atu_unroll(struct dw_pcie *pci, int index,
> > >> 		if (val & PCIE_ATU_ENABLE)
> > >> 			return;
> > >>
> > >>-		usleep_range(LINK_WAIT_IATU_MIN, LINK_WAIT_IATU_MAX);
> > >>+		mdelay(LINK_WAIT_IATU_MIN);
> > >> 	}
> > >Spinning for 9ms (possibly 10 times) isn't really a good idea.
> >
> > Yes, it may not be a good idea. However, in our experiments it never
> > hit the maximum retry count. I just converted usleep_range to mdelay,
> > keeping the minimum delay as it is, though I am not sure how we
> > arrived at these numbers in the original code. Maybe Joao Pinto from
> > Synopsys has some idea. I will try a few experiments to find out
> > what minimum delay is sufficient for these mdelay calls on our
> > hardware.
>
> Just based on the preceding comment, it looks like the wait is
> essential because subsequent config and I/O accesses won't work
> correctly until the ATU enable takes effect.
>
> If we time out here, I suspect it's because something is seriously
> wrong in the hardware, so I doubt there's any point in trying to
> minimize the timeout period. If something is that broken, it doesn't
> matter whether we wait 9ms or 900ms.
>
> Maybe the message should be more strident or maybe we should even
> return failure so the caller can do something, e.g., fail an access,
> instead of just printing an error and continuing on.
>
> I'm also looking for an ack from Joao and/or Jingoo.

Dropping for lack of ack.
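
For reference, here is a rough sketch of the "return failure so the caller
can do something" idea quoted above. It is a userspace model, not driver
code: every name in it (fake_atu, fake_atu_read, fake_delay,
atu_wait_enabled, ATU_ENABLE, WAIT_RETRIES) is hypothetical, the hardware
is simulated, and the retry count of 10 is only taken from the "possibly 10
times" remark in the thread. The delay is a no-op stub standing in for
whatever wait is legal in the calling context.

```c
#include <errno.h>
#include <stdint.h>

/* All names below are hypothetical stand-ins, not the driver's symbols. */
#define ATU_ENABLE	(1u << 31)
#define WAIT_RETRIES	10	/* polls before giving up */

/* Simulated hardware: the enable bit reads back set after N polls. */
struct fake_atu {
	uint32_t ctrl;
	int polls_until_ready;	/* negative: the bit never becomes set */
	int polls_seen;
};

static uint32_t fake_atu_read(struct fake_atu *atu)
{
	atu->polls_seen++;
	if (atu->polls_until_ready >= 0 &&
	    atu->polls_seen >= atu->polls_until_ready)
		atu->ctrl |= ATU_ENABLE;
	return atu->ctrl;
}

/* Stand-in for the delay between polls; a no-op in this model. */
static void fake_delay(void)
{
}

/*
 * Poll until the enable bit reads back set. Returns 0 on success and
 * -ETIMEDOUT once the retry budget is used up, so the caller can fail
 * the access instead of continuing with an unprogrammed ATU.
 */
static int atu_wait_enabled(struct fake_atu *atu)
{
	int retries;

	for (retries = 0; retries < WAIT_RETRIES; retries++) {
		if (fake_atu_read(atu) & ATU_ENABLE)
			return 0;
		fake_delay();
	}

	return -ETIMEDOUT;
}
```

A caller in this model would check the return value and propagate the
error rather than log it and carry on. Whether the real callers can
usefully handle such a failure is a separate question that the sketch does
not answer.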