Re: [PATCH 3/3] PCI: Xilinx NWL PCIe: Fix Error for multi function device for legacy interrupts.

From: Bjorn Helgaas
Date: Mon Sep 12 2016 - 18:02:52 EST


On Thu, Sep 01, 2016 at 05:19:55AM +0000, Bharat Kumar Gogada wrote:
> > >>>> Hi Bharat,
> > >>>>> @@ -561,7 +561,7 @@ static int nwl_pcie_init_irq_domain(struct nwl_pcie *pcie)
> > >>>>> }
> > >>>>>
> > >>>>> pcie->legacy_irq_domain = irq_domain_add_linear(legacy_intc_node,
> > >>>>> - INTX_NUM,
> > >>>>> + INTX_NUM + 1,
> > >>>>> &legacy_domain_ops,
> > >>>>> pcie);
> > >>>>
> > >>>> This feels like the wrong thing to do. You have INTX_NUM irqs, so
> > >>>> the domain allocation should reflect this. On the other hand, the
> > >>>> way the driver currently deals with mappings is quite broken
> > >>>> (consistently adding 1 to the HW interrupt).
> > >>>>
> > >>> Hi Marc,
> > >>>
> > >>> Without the above change, I get the following crash in the kernel while booting.
> > >>>
> > >>> [ 2.441684] error: hwirq 0x4 is too large for dummy
> > >>> [ 2.441694] ------------[ cut here ]------------
> > >>> [ 2.441698] WARNING: at kernel/irq/irqdomain.c:344
> > >>> [ 2.441702] Modules linked in:
> > >>> [ 2.441706]
> > >>> [ 2.441714] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 4.4.0 #8
> > >>> [ 2.441718] Hardware name: xlnx,zynqmp (DT)
> > >>> [ 2.441723] task: ffffffc071886b80 ti: ffffffc071888000 task.ti: ffffffc071888000
> > >>> [ 2.441732] PC is at irq_domain_associate+0x138/0x1c0
> > >>> [ 2.441738] LR is at irq_domain_associate+0x138/0x1c0
> > >>>
> > >>> In kernel/irq/irqdomain.c function irq_domain_associate
> > >>>
> > >>> if (WARN(hwirq >= domain->hwirq_max,
> > >>>          "error: hwirq 0x%x is too large for %s\n", (int)hwirq, domain->name))
> > >>>         return -EINVAL;
> > >>>
> > >>> Here hwirq and hwirq_max are both equal to 4 without the above
> > >>> change (INTX_NUM + 1), which is why the crash occurs.
> > >>> This happens because the legacy interrupts start from 1 (INTA).
> > >>
> > >> I understood that. I'm still persisting in saying that you have the wrong fix.
> > >>
> > >> Your domain should always allocate as many interrupts as you have
> > >> interrupt sources. These interrupts (hwirq) should be numbered from 0 to (n-1).
> > >
> > > Agreed, but here is the problem: the hwirq values for legacy interrupts
> > > run from 0x1 to 0x4 (INTA to INTD), and these values are as per the
> > > PCIe specification for legacy interrupts, so they cannot be numbered
> > > from 0. So when 0x4 (INTD) arrives for a multi-function device, the
> > > crash occurs.
> >
> > So who provides this hwirq? Who calls irq_domain_associate() with hwirq set to 4?
> >
> The PCIe subsystem invokes pcibios_add_device() in arch/arm64/kernel/pci.c for every PCI device.
> The purpose of this function is to assign dev->irq using of_irq_parse_and_map_pci().
> of_irq_parse_and_map_pci() invokes of_irq_parse_pci(), which reads PCI_INTERRUPT_PIN from configuration space and saves it
> as a parameter in struct of_phandle_args.
> This structure is passed to irq_create_of_mapping(), which invokes irq_create_fwspec_mapping().
> irq_create_fwspec_mapping() invokes irq_domain_translate() to get hwirq; here the saved PCI_INTERRUPT_PIN value is assigned
> to hwirq (*hwirq = fwspec->param[0]).
> Then irq_create_mapping() -> irq_domain_associate() are invoked with this hwirq, and a mapping is created between the virtual irq and this hwirq.
> For any endpoint the PCI_INTERRUPT_PIN value ranges from 0x1 to 0x4, and so hwirq ranges from 0x1 to 0x4.
>
> So the values are generic with respect to the protocol, which is why hwirq ranges from 0x1 to 0x4.
> If you check pcie-altera.c, it does the same: it adds one in its handler and when creating the legacy domain.
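
To make the numbering mismatch concrete, here is a minimal userspace
sketch, not kernel code. model_associate() is a hypothetical stand-in
for the hwirq_max check quoted above, and model_xlate_pin() is a
hypothetical translate step that maps PCI_INTERRUPT_PIN (1..4) onto
hwirq (0..3). It illustrates one way a domain of size INTX_NUM could
accept INTD; it is an assumption for discussion, not the fix adopted
in this thread.

```c
#include <errno.h>
#include <stdio.h>

#define INTX_NUM 4 /* INTA..INTD */

/* Hypothetical stand-in for the size check in irq_domain_associate(). */
static int model_associate(unsigned int hwirq, unsigned int hwirq_max)
{
	if (hwirq >= hwirq_max) {
		fprintf(stderr, "error: hwirq 0x%x is too large\n", hwirq);
		return -EINVAL;
	}
	return 0;
}

/* Hypothetical translate step: PCI_INTERRUPT_PIN (1..4) -> hwirq (0..3). */
static int model_xlate_pin(unsigned int pin, unsigned int *hwirq)
{
	if (pin < 1 || pin > INTX_NUM)
		return -EINVAL;
	*hwirq = pin - 1;
	return 0;
}
```

With a domain of size INTX_NUM, model_associate(4, INTX_NUM) is
rejected, which matches the warning above. Either growing the domain to
INTX_NUM + 1 (the patch) or translating the pin to a 0-based hwirq
first lets INTD through the check.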

Is this resolved yet? Marc, are you happy, or should we iterate on this
again?