Re: pcie: xilinx: kernel hang - ISR readl()

From: Muni Sekhar
Date: Fri Jan 31 2020 - 11:35:02 EST


On Fri, Jan 31, 2020 at 12:30 AM Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote:
>
> On Thu, Jan 30, 2020 at 09:37:48PM +0530, Muni Sekhar wrote:
> > On Thu, Jan 9, 2020 at 10:05 AM Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote:
> > >
> > > On Thu, Jan 09, 2020 at 08:47:51AM +0530, Muni Sekhar wrote:
> > > > On Thu, Jan 9, 2020 at 1:45 AM Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote:
> > > > > On Tue, Jan 07, 2020 at 09:45:13PM +0530, Muni Sekhar wrote:
> > > > > > Hi,
> > > > > >
> > > > > > I have a module with a Xilinx FPGA. It implements UART(s), SPI(s),
> > > > > > parallel I/O and interfaces them to the host CPU via the PCI Express bus.
> > > > > > My system freezes without capturing a crash dump during certain
> > > > > > tests. I debugged this issue and tracked it down to the interrupt
> > > > > > handler code below.
> > > > > >
> > > > > >
> > > > > > The ISR first reads the Interrupt Status register using "readl()" as
> > > > > > given below.
> > > > > > status = readl(ctrl->reg + INT_STATUS);
> > > > > >
> > > > > >
> > > > > > It then clears the pending interrupts using "writel()" as given below.
> > > > > > writel(status, ctrl->reg + INT_STATUS);
> > > > > >
> > > > > >
> > > > > > I've noticed a kernel hang if the INT_STATUS register is read again
> > > > > > after clearing the pending interrupts.
> > > > > >
> > > > > > Can someone clarify why the kernel hangs without a crash dump if I
> > > > > > read the INT_STATUS register using readl() after clearing the
> > > > > > pending bits?
> > > > > >
> > > > > > Can readl() block?
> > > > >
> > > > > readl() should not block in software. Obviously at the hardware CPU
> > > > > instruction level, the read instruction has to wait for the result of
> > > > > the read. Since that data is provided by the device, i.e., your FPGA,
> > > > > it's possible there's a problem there.
> > > >
> > > > Thank you very much for your reply.
> > > > Where can I find details about the protocol for reading
> > > > "memory mapped IO"? Can you point me to any useful links?
> > > > I tried to locate the exact point in the kernel code where the CPU
> > > > waits for the read instruction, as given below.
> > > > readl() -> __raw_readl() -> return *(const volatile u32 __force *)addr
> > > > Do I need to check the assembly instructions here?
> > >
> > > The C pointer dereference, e.g., "*address", will be some sort of a
> > > "load" instruction in assembly. The CPU wait isn't explicit; it's
> > > just that when you load a value, the CPU waits for the value.
> > >
> > > > > Can you tell whether the FPGA has received the Memory Read for
> > > > > INT_STATUS and sent the completion?
> > > >
> > > > Is there a way to know this with software debugging (either
> > > > enabling dynamic debugging or adding new debug prints)? Can you please
> > > > point to some tools/hardware needed to find this?
> > >
> > > You could learn this either via a PCIe analyzer (expensive piece of
> > > hardware) or possibly some logic in the FPGA that would log PCIe
> > > transactions in a buffer and make them accessible via some other
> > > interface (you mentioned it had parallel and other interfaces).
> > >
> > > > > On the architectures I'm familiar with, if a device doesn't respond,
> > > > > something would eventually time out so the CPU doesn't wait forever.
> > > >
> > > > What is the timeout here? I mean, how long does the CPU wait for the
> > > > completion? Since this code runs in interrupt context, does it cause
> > > > the system to freeze if the timeout is long?
> > >
> > > The Root Port should have a Completion Timeout. This is required by
> > > the PCIe spec. The *reporting* of the timeout is somewhat
> > > implementation-specific since the reporting is outside the PCIe
> > > domain. I don't know the duration of the timeout, but it certainly
> > > shouldn't be long enough to look like a "system freeze".
> > Does the kernel write to the PCIe configuration space register "Device
> > Control 2 Register" (offset 0x28)? When I read this register,
> > I noticed bit 4 is set (which disables completion timeouts) and all
> > other bits are zero. So the Completion Timeout detection mechanism is
> > disabled, right? If so, what could be the reason for this?
>
> To my knowledge Linux doesn't set PCI_EXP_DEVCTL2_COMP_TMOUT_DIS
> except for one powerpc case. You can check yourself by using cscope
> or grep to look for PCI_EXP_DEVCTL2_COMP_TMOUT_DIS or PCI_EXP_DEVCTL2.
>
> If you're seeing bit 4 (PCI_EXP_DEVCTL2_COMP_TMOUT_DIS) set, it's
> likely because firmware set it. You can try booting with
> "pci=earlydump" to see what's there before Linux starts changing
> things.

[ 0.000000] pci 0000:01:00.0 config space:

00: 56 15 55 55 07 00 10 00 00 00 00 05 10 00 00 00
10: 00 00 40 d0 00 00 00 00 00 00 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 00 40 00 00
30: 00 00 00 00 40 00 00 00 00 00 00 00 0b 01 00 00
40: 01 48 03 00 08 00 00 00 05 60 00 00 00 00 00 00
50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
60: 10 00 02 00 c2 8f 00 00 00 28 01 00 21 f4 03 00
70: 01 00 21 00 00 00 00 00 00 00 00 00 00 00 00 00
80: 00 00 00 00 02 00 00 00 10 00 00 00 00 00 00 00
90: 00 00 01 00 00 00 00 00 00 00 00 00 00 00 00 00
a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00


"Device Control 2" is located at offset 0x28 in the PCI Express Capability
Structure. But where is the 'PCI Express Capability Structure' located
in the 'PCI Express Configuration Space' dump above?
>
> Bjorn



--
Thanks,
Sekhar