Re: [Query] ACS enablement in the DT based boot flow

From: Pavan Kondeti
Date: Thu Jul 18 2024 - 06:13:47 EST


Hi Vidya/Will,

On Sun, Apr 28, 2024 at 08:23:18AM +0100, Will Deacon wrote:
> On Wed, Apr 10, 2024 at 02:28:40PM -0500, Bjorn Helgaas wrote:
> > [+cc Will, Joerg]
> >
> > On Mon, Apr 01, 2024 at 10:40:15AM +0000, Vidya Sagar wrote:
> > > Hi folks,
> > > ACS (Access Control Services) is configured for a PCI device through
> > > pci_enable_acs(). The first thing pci_enable_acs() checks for is
> > > whether the global flag 'pci_acs_enable' is set or not. The global
> > > flag 'pci_acs_enable' is set by the function pci_request_acs().
> > >
> > > pci_enable_acs() function is called whenever a new PCI device is
> > > added to the system
> > >
> > > pci_enable_acs+0x4c/0x2a4
> > > pci_acs_init+0x38/0x60
> > > pci_device_add+0x1a0/0x670
> > > pci_scan_single_device+0xc4/0x100
> > > pci_scan_slot+0x6c/0x1e0
> > > pci_scan_child_bus_extend+0x48/0x2e0
> > > pci_scan_root_bus_bridge+0x64/0xf0
> > > pci_host_probe+0x18/0xd0
> > >
> > > In the case of a system that boots using device-tree blob,
> > > pci_request_acs() is called when the device driver binds with the
> > > respective device
> > >
> > > of_iommu_configure+0xf4/0x230
> > > of_dma_configure_id+0x110/0x340
> > > pci_dma_configure+0x54/0x120
> > > really_probe+0x80/0x3e0
> > > __driver_probe_device+0x88/0x1c0
> > > driver_probe_device+0x3c/0x140
> > > __device_attach_driver+0xe8/0x1e0
> > > bus_for_each_drv+0x78/0xf0
> > > __device_attach+0x104/0x1e0
> > > device_attach+0x14/0x30
> > > pci_bus_add_device+0x50/0xd0
> > > pci_bus_add_devices+0x38/0x90
> > > pci_host_probe+0x40/0xd0
> > >
> > > Since the device addition always happens first followed by the
> > > driver binding, this flow effectively makes sure that ACS never gets
> > > enabled.
> > >
> > > Ideally, I would expect the pci_request_acs() get called (probably
> > > by the OF framework itself) before calling pci_enable_acs().
> > >
> > > This happens in the ACPI flow where pci_request_acs() is called
> > > during IORT node initialization (i.e. iort_init_platform_devices()
> > > function).
> > >
> > > Is this understanding correct? If yes, would it make sense to call
> > > pci_request_acs() during OF initialization (similar to IORT
> > > initialization in ACPI flow)?
> >
> > Your understanding looks correct to me. My call graph notes, FWIW:
> >
> >   mem_init
> >     pci_iommu_alloc                   # x86 only
> >       amd_iommu_detect                # init_state = IOMMU_START_STATE
> >         iommu_go_to_state(IOMMU_IVRS_DETECTED)
> >           state_next
> >             switch (init_state)
> >             case IOMMU_START_STATE:
> >               detect_ivrs
> >                 pci_request_acs
> >                   pci_acs_enable = 1  # <--
> >       detect_intel_iommu
> >         pci_request_acs
> >           pci_acs_enable = 1          # <--
> >
> >   pci_scan_single_device              # PCI enumeration
> >     ...
> >       pci_init_capabilities
> >         pci_acs_init
> >           pci_enable_acs
> >             if (pci_acs_enable)       # <--
> >               pci_std_enable_acs
> >
> >   __driver_probe_device
> >     really_probe
> >       pci_dma_configure               # pci_bus_type.dma_configure
> >         if (OF)
> >           of_dma_configure
> >             of_dma_configure_id
> >               of_iommu_configure
> >                 pci_request_acs       # <-- 6bf6c24720d3
> >                 iommu_probe_device
> >         else if (ACPI)
> >           acpi_dma_configure
> >             acpi_dma_configure_id
> >               acpi_iommu_configure_id
> >                 iommu_probe_device
> >
> > The pci_request_acs() in of_iommu_configure(), which happens too late
> > to affect pci_enable_acs(), was added by 6bf6c24720d3 ("iommu/of:
> > Request ACS from the PCI core when configuring IOMMU linkage"), so I
> > cc'd Will and Joerg. I don't know if that *used* to work and got
> > broken somehow, or if it never worked as intended.
>
> I don't have any way to test this, but I'm supportive of having the same
> flow for DT and ACPI-based flows. Vidya, are you able to cook a patch?
>

I made a similar observation while testing PCI device assignment to a
VM. In my configuration, the virtio-iommu is itself enumerated over the
PCI transport, so I don't think we can hook pci_request_acs() into an
IOMMU driver. Does the patch below make sense?

I tested the patch in a VM: ACS gets enabled and separate IOMMU groups
are created for the devices attached under the PCIe root port(s).
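
For reference, the ordering issue can be modeled in user space. This is
only a toy sketch, not kernel code: every sim_* name is made up for
illustration, sim_acs_requested stands in for the global pci_acs_enable
flag, and the driver-bind step is simplified to an unconditional request.
It assumes the device has an ACS capability and no ACS quirk.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's global pci_acs_enable flag. */
static bool sim_acs_requested;

struct sim_dev {
	bool has_acs_cap;	/* device advertises an ACS capability */
	bool acs_enabled;	/* outcome of the enable step */
};

/* Models pci_request_acs(): it only sets the global flag. */
static void sim_request_acs(void)
{
	sim_acs_requested = true;
}

/* Models pci_enable_acs(): does nothing unless the flag is already set. */
static void sim_enable_acs(struct sim_dev *dev)
{
	assert(dev);
	if (sim_acs_requested && dev->has_acs_cap)
		dev->acs_enabled = true;
}

/*
 * Models one pass through pci_host_probe() for a single device.
 *
 * request_at_bridge == false: ACS is requested only at driver bind
 *                             (the current DT flow).
 * request_at_bridge == true:  ACS is requested while the host bridge is
 *                             registered, if the bridge node has an
 *                             "iommu-map" (the proposed flow).
 *
 * Returns whether ACS ended up enabled on the device.
 */
static bool sim_host_probe(bool has_iommu_map, bool request_at_bridge)
{
	struct sim_dev dev = { .has_acs_cap = true, .acs_enabled = false };

	sim_acs_requested = false;		/* fresh boot */

	if (request_at_bridge && has_iommu_map)
		sim_request_acs();		/* host bridge registration */

	sim_enable_acs(&dev);			/* device add during bus scan */

	sim_request_acs();			/* driver bind: too late for dev */

	return dev.acs_enabled;
}
```

In this model, sim_host_probe(true, false) leaves ACS disabled, while
sim_host_probe(true, true) enables it, because the request now precedes
the scan.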

RCs/devices with ACS quirks do not suffer from this problem, because
pci_acs_enabled() -> pci_dev_specific_acs_enabled() short-circuits the
ACS capability check. Maybe this is one of the reasons why the problem
has not been reported/observed on some DT platforms.
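
A rough user-space sketch of that short circuit, as I understand it: the
device-specific check is consulted first, and the generic ACS control
bits are only looked at when no quirk applies. The simq_* names, the
struct layout and SIMQ_NO_QUIRK are all made up for illustration; the
real code signals "no quirk" with a negative return value.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical marker for "no device-specific quirk applies". */
#define SIMQ_NO_QUIRK	(-1)

struct simq_dev {
	int quirk_result;	/* SIMQ_NO_QUIRK, 0 (not isolated) or 1 (isolated) */
	uint16_t acs_ctrl;	/* ACS control bits actually programmed */
};

/* Models pci_dev_specific_acs_enabled(): negative means "no quirk". */
static int simq_dev_specific_acs_enabled(const struct simq_dev *dev,
					  uint16_t acs_flags)
{
	(void)acs_flags;	/* a real quirk would inspect the flags */
	return dev->quirk_result;
}

/* Models pci_acs_enabled(): the quirk answer wins over the control bits. */
static bool simq_acs_enabled(const struct simq_dev *dev, uint16_t acs_flags)
{
	int ret = simq_dev_specific_acs_enabled(dev, acs_flags);

	if (ret >= 0)
		return ret > 0;

	/* No quirk: every requested flag must be set in the control bits. */
	return (dev->acs_ctrl & acs_flags) == acs_flags;
}
```

A quirked device reports isolation even with acs_ctrl == 0, which is why
the missing pci_request_acs() goes unnoticed there; a device without a
quirk depends entirely on the control bits having been programmed.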

diff --git a/drivers/pci/of.c b/drivers/pci/of.c
index b908fe1ae951..0eeb7abfbcfa 100644
--- a/drivers/pci/of.c
+++ b/drivers/pci/of.c
@@ -123,6 +123,13 @@ bool pci_host_of_has_msi_map(struct device *dev)
 	return false;
 }
 
+bool pci_host_of_has_iommu_map(struct device *dev)
+{
+	if (dev && dev->of_node)
+		return of_get_property(dev->of_node, "iommu-map", NULL);
+	return false;
+}
+
 static inline int __of_pci_pci_compare(struct device_node *node,
 				       unsigned int data)
 {
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 4c367f13acdc..ea6fcdaf63e2 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -889,6 +889,7 @@ static void pci_set_bus_msi_domain(struct pci_bus *bus)
 	dev_set_msi_domain(&bus->dev, d);
 }
 
+bool pci_host_of_has_iommu_map(struct device *dev);
 static int pci_register_host_bridge(struct pci_host_bridge *bridge)
 {
 	struct device *parent = bridge->dev.parent;
@@ -951,6 +952,9 @@ static int pci_register_host_bridge(struct pci_host_bridge *bridge)
 	    !pci_host_of_has_msi_map(parent))
 		bus->bus_flags |= PCI_BUS_FLAGS_NO_MSI;
 
+	if (pci_host_of_has_iommu_map(parent))
+		pci_request_acs();
+
 	if (!parent)
 		set_dev_node(bus->bridge, pcibus_to_node(bus));

diff --git a/include/linux/pci.h b/include/linux/pci.h
index cafc5ab1cbcb..7eceed71236a 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -2571,6 +2571,7 @@ struct device_node;
 struct irq_domain;
 struct irq_domain *pci_host_bridge_of_msi_domain(struct pci_bus *bus);
 bool pci_host_of_has_msi_map(struct device *dev);
+bool pci_host_of_has_iommu_map(struct device *dev);
 
 /* Arch may override this (weak) */
 struct device_node *pcibios_get_phb_of_node(struct pci_bus *bus);
@@ -2579,6 +2580,7 @@ struct device_node *pcibios_get_phb_of_node(struct pci_bus *bus);
 static inline struct irq_domain *
 pci_host_bridge_of_msi_domain(struct pci_bus *bus) { return NULL; }
 static inline bool pci_host_of_has_msi_map(struct device *dev) { return false; }
+static inline bool pci_host_of_has_iommu_map(struct device *dev) { return false; }
 #endif /* CONFIG_OF */
 
 static inline struct device_node *

Thanks,
Pavan