Re: [PATCH] PCI/MSI: pci-xgene-msi: Enable MSI support in ACPI boot for X-Gene v1

From: Duc Dang
Date: Fri Jun 03 2016 - 18:15:50 EST


On Fri, May 27, 2016 at 3:52 AM, Lorenzo Pieralisi
<lorenzo.pieralisi@xxxxxxx> wrote:
>
> [+ Rafael]
>
> On Thu, May 26, 2016 at 01:49:23PM -0700, Duc Dang wrote:
> > Hi Lorenzo,
> >
> > On Thu, May 26, 2016 at 5:34 AM, Lorenzo Pieralisi
> > <lorenzo.pieralisi@xxxxxxx> wrote:
> > > Hi Duc,
> > >
> > > On Wed, May 25, 2016 at 04:13:35PM -0700, Duc Dang wrote:
> > >> On Thu, Feb 25, 2016 at 9:38 AM, Lorenzo Pieralisi
> > >> <lorenzo.pieralisi@xxxxxxx> wrote:
> > >> > On Wed, Feb 24, 2016 at 02:28:10PM -0800, Duc Dang wrote:
> > >> >> On Wed, Feb 24, 2016 at 8:16 AM, Marc Zyngier <marc.zyngier@xxxxxxx> wrote:
> > >> >> > On 24/02/16 16:09, Mark Salter wrote:
> > >> >> >> On Tue, 2016-02-09 at 17:56 -0800, Duc Dang wrote:
> > >> >> >>> This patch makes pci-xgene-msi driver ACPI-aware and provides
> > >> >> >>> MSI capability for X-Gene v1 PCIe controllers in ACPI boot mode.
> > >> >> >>>
> > >> >> >>> Signed-off-by: Duc Dang <dhdang@xxxxxxx>
> > >> >> >>> ---
> > >> >> >>> drivers/pci/host/pci-xgene-msi.c | 35 ++++++++++++++++++++++++++++++++---
> > >> >> >>> 1 file changed, 32 insertions(+), 3 deletions(-)
> > >> >> >>>
> > >> >> >>> diff --git a/drivers/pci/host/pci-xgene-msi.c b/drivers/pci/host/pci-xgene-msi.c
> > >> >> >>> index a6456b5..466aa93 100644
> > >> >> >>> --- a/drivers/pci/host/pci-xgene-msi.c
> > >> >> >>> +++ b/drivers/pci/host/pci-xgene-msi.c
> > >> >> >>> @@ -24,6 +24,7 @@
> > >> >> >>> #include <linux/pci.h>
> > >> >> >>> #include <linux/platform_device.h>
> > >> >> >>> #include <linux/of_pci.h>
> > >> >> >>> +#include <linux/acpi.h>
> > >> >> >>>
> > >> >> >>> #define MSI_IR0 0x000000
> > >> >> >>> #define MSI_INT0 0x800000
> > >> >> >>> @@ -39,7 +40,7 @@ struct xgene_msi_group {
> > >> >> >>> };
> > >> >> >>>
> > >> >> >>> struct xgene_msi {
> > >> >> >>> - struct device_node *node;
> > >> >> >>> + struct fwnode_handle *fwnode;
> > >> >> >>> struct irq_domain *inner_domain;
> > >> >> >>> struct irq_domain *msi_domain;
> > >> >> >>> u64 msi_addr;
> > >> >> >>> @@ -249,6 +250,13 @@ static const struct irq_domain_ops msi_domain_ops = {
> > >> >> >>> .free = xgene_irq_domain_free,
> > >> >> >>> };
> > >> >> >>>
> > >> >> >>> +#ifdef CONFIG_ACPI
> > >> >> >>> +static struct fwnode_handle *xgene_msi_get_fwnode(struct device *dev)
> > >> >> >>> +{
> > >> >> >>> + return xgene_msi_ctrl.fwnode;
> > >> >> >>> +}
> > >> >> >>> +#endif
> > >> >> >>> +
> > >> >> >>> static int xgene_allocate_domains(struct xgene_msi *msi)
> > >> >> >>> {
> > >> >> >>> msi->inner_domain = irq_domain_add_linear(NULL, NR_MSI_VEC,
> > >> >> >>> @@ -256,7 +264,7 @@ static int xgene_allocate_domains(struct xgene_msi *msi)
> > >> >> >>> if (!msi->inner_domain)
> > >> >> >>> return -ENOMEM;
> > >> >> >>>
> > >> >> >>> - msi->msi_domain = pci_msi_create_irq_domain(of_node_to_fwnode(msi->node),
> > >> >> >>> + msi->msi_domain = pci_msi_create_irq_domain(msi->fwnode,
> > >> >> >>> &xgene_msi_domain_info,
> > >> >> >>> msi->inner_domain);
> > >> >> >>
> > >> >> >> This doesn't work for me (ACPI probing on Mustang) unless I change this
> > >> >> >> to be pci_msi_create_default_irq_domain(). The problem seems to be that
> > >> >> >> the MSI probe happens after the PCIe RC is probed so there is no MSI domain
> > >> >> >> at the time the PCIe root is initialized by ACPI.
> > >> >> >
> > >> >> > pci_msi_create_default_irq_domain is the wrong thing do use, specially
> > >> >> > if you have multiple MSI controllers in the system. I certainly wouldn't
> > >> >> > want to see it being used on arm64.
> > >> >> >
> > >> >> > This is the usual dependency hell. You try moving the probing earlier,
> > >> >> > but that may break something else in the process.
> > >> >>
> > >> >> Hi Mark and Marc,
> > >> >>
> > >> >> I have modified Tianocore firmware to have MSI node declared before
> > >> >> PCIe node in Dsdt table. With this modification, the MSI driver will
> > >> >> be loaded before PCIe driver and MSI domain is available at the time
> > >> >> PCIe root is initialized.
> > >> >
> > >> > I am totally against this. We should not hack ACPI tables to make
> > >> > the kernel work (on top of that with unwritten ordering rules that may
> > >> > well require changes as kernel evolves), we should have a standard set
> > >> > of bindings that people use to describe HW in the DSDT and the kernel(s)
> > >> > has to cope with that. If there is a dependency problem in the description
> > >> > we may solve it at bindings level, but I absolutely do not want to rely
> > >> > on DSDT nodes ordering for things to work, that's fragile, no shortcuts
> > >> > please.
> > >>
> > >> Hi Lorenzo, Bjorn,
> > >>
> > >> (Refresh this thread on this merge cycle)
> > >>
> > >> I tried to use _DEP method to describe the dependency of PCIe node to
> > >> MSI node but it turns out that PCIe driver does not check for the
> > >> dependency (only EC (ec.c) and I2C core (i2c-core.c) use
> > >> acpi_walk_dep_device_list to check for dependency before loading
> > >> dependent drivers)
> > >>
> > >> Do you have other suggestion/example about how to ensure strict
> > >> ordering on driver loading in ACPI boot mode without changing the
> > >> order of ACPI nodes in DSDT?
> > >
> > > Before doing anything else, I would like to ask you to report the code
> > > paths leading to failures (what fails actually ?) please, and I want
> > > to understand what "this does not work" means.
> > >
> > > In particular:
> > >
> > > - Why using pci_msi_create_default_irq_domain() works
> > > - Why, if you change the DSDT nodes ordering, things work (please add
> > > some debugging to MSI allocation functions to detect the code paths
> > > leading to correct MSI allocation)
> > > - What's the difference wrt the DT probe path
> >
> > The MSI allocation happens with following code path (I use PMC pm80xx
> > SAS driver as an example (drivers/scsi/pm8001/pm8001_init.c)):
> > pci_enable_msix_exact
> > pci_enable_msix_range
> > pci_enable_msix
> > msix_capbility_init
> > pci_msi_setup_msi_irqs
> > pci_msi_get_domain
> > dev_get_msi_domain
> > arch_get_pci_msi_domain
> > return pci_msi_default_domain;
> >
> > In my failing case: pci_msi_get_domain return NULL, so MSI allocation
> > function (pci_enable_msix_exact) fails.
> >
> > The expected MSI domain is registered when X-Gene MSI driver is
> > initialized and is attached to each PCI bus by calling
> > pci_set_bus_msi_domain during PCIe bus-scan:
> > pci_scan_root_bus
> > pci_scan_root_bus_msi
> > pci_scan_child_bus
> > pci_scan_bridge
> > pci_add_new_bus
> > pci_alloc_child_bus
> > pci_set_bus_msi_domain
> > pci_host_bridge_msi_domain
> > pci_host_bridge_of_msi_domain
> > pci_host_bridge_acpi_msi_domain
> >
> > + When booting with DT: X-Gene MSI driver is initialized early
> > (subsys_initcall level); X-Gene PCIe driver is initialized later at
> > device_initcall level (using module_platform_driver). Because of that
> > when PCIe enumeration happens, the X-Gene MSI domain was already
> > registered and available. So when the driver of PCIe Endpoint devices
> > gets loaded, the call to pci_enable_msix_exact (to allocate MSI)
> > succeeds.
> >
> > + When booting with ACPI: acpi_scan_init is called at subsys_initcall
> > level (the same level with X-Gene MSI driver):
> > acpi_init
> > acpi_bus_init (GIC is initialized here)
> > acpi_scan_init
> > acpi_pci_root_init
> > acpi_pci_link_init
> > ...
> > acpi_bus_scan
> > acpi_bus_check_add
> > ...
> >
> > Currently, in DSDT table, X-Gene MSI node is defined right after
> > X-Gene PCIe nodes. So PCIe driver initialization and PCIe enumeration
> > happens before MSI driver is initialized. During PCIe bus scan, the
> > pci_set_bus_msi_domain gets a NULL irq_domain value from
> > pci_host_bridge_msi_domain (as MSI driver is not initialized), so
> > there is no MSI support for PCIe bus.
> >
> > + Using pci_msi_create_default_irq_domain() helps MSI allocation
> > succeed as pci_msi_get_domain (in the code path of
> > pci_enable_msix_exact above) can fall back to get a valid domain from
> > arch_get_pci_msi_domain (please note that PCIe endpoint driver gets
> > called later (at device_initcall level (using module_init)), so at the
> > time PCIe endpoint driver probe function executes, the MSI driver
> > already register the global MSI domain with
> > pci_msi_create_default_irq_domain)
> >
> > + If changing DSDT node ordering, to define MSI node before PCIe
> > nodes, then X-Gene MSI driver will be loaded before PCIe host driver.
> > At the time PCIe enumeration happens, X-Gene MSI driver already
> > registered a valid MSI domain (by calling
> > pci_msi_register_fwnode_provider to register the fwnode provider
> > function), so all the detected PCIe buses (and PCIe endpoint devices)
> > will have a valid MSI domain attached to them (as a result of function
> > pci_set_bus_msi_domain)
>
> This still works out of sheer luck, given that relying on initcall
> ordering at same init level is undefined (ie it works if your msi driver
> registration call is executed before acpi_init() is called).
>
> I though about using a scan handler like:
>
> See arch/ia64/hp/common/sba_iommu.c acpi_sba_ioc_handler
>
> but that won't solve your problem either since, given that PNP0A03 (ie
> PCI host bridge) is handled through a scan handler too, IIUC you will
> still have issues related to namespace device nodes ordering.
>
> I suspect we will have to add a hook walking the ACPI namespace before
> acpi_bus_scan() is called in acpi_scan_init() (aka MSI controllers must
> be probed before any other device is, that's the gist), I CC'ed Rafael
> to see if there is any other suitable way to sort out this issue.

Thanks for your input, Lorenzo.

Hi Rafael,

Do you have other suggestions? Otherwise, I will prepare a patch
following Lorenzo's approach.

Regards,
Duc Dang.
>
> Thanks,
> Lorenzo