Re: [PATCH v5 08/12] PCI: imx6: Config look up table(LUT) to support MSI ITS and IOMMU for i.MX95

From: Frank Li
Date: Thu Jun 06 2024 - 16:24:47 EST


On Mon, Jun 03, 2024 at 04:07:55PM -0400, Frank Li wrote:
> On Mon, Jun 03, 2024 at 01:56:27PM -0500, Bjorn Helgaas wrote:
> > On Mon, Jun 03, 2024 at 02:42:45PM -0400, Frank Li wrote:
> > > On Mon, Jun 03, 2024 at 12:19:21PM -0500, Bjorn Helgaas wrote:
> > > > On Fri, May 31, 2024 at 03:58:49PM +0100, Robin Murphy wrote:
> > > > > On 2024-05-31 12:08 am, Bjorn Helgaas wrote:
> > > > > > [+cc IOMMU and pcie-apple.c folks for comment]
> > > > > >
> > > > > > On Tue, May 28, 2024 at 03:39:21PM -0400, Frank Li wrote:
> > > > > > > For the i.MX95, configuration of a LUT is necessary to convert Bus Device
> > > > > > > Function (BDF) to stream IDs, which are utilized by both IOMMU and ITS.
> > > > > > > This involves examining the msi-map and smmu-map to ensure consistent
> > > > > > > mapping of PCI BDF to the same stream IDs. Subsequently, LUT-related
> > > > > > > registers are configured. In the absence of an msi-map, the built-in MSI
> > > > > > > controller is utilized as a fallback.
> > > > > > >
> > > > > > > Additionally, register a PCI bus notifier to trigger imx_pcie_add_device()
> > > > > > > upon the appearance of a new PCI device and when the bus is an iMX6 PCI
> > > > > > > controller. This function configures the correct LUT based on Device Tree
> > > > > > > Settings (DTS).
> > > > > >
> > > > > > This scheme is pretty similar to apple_pcie_bus_notifier(). If we
> > > > > > have to do this, I wish it were *more* similar, i.e., copy the
> > > > > > function names, bitmap tracking, code structure, etc.
> > > > > >
> > > > > > I don't really know how stream IDs work, but I assume they are used on
> > > > > > most or all arm64 platforms, so I'm a little surprised that of all the
> > > > > > PCI host drivers used on arm64, only pcie-apple.c and pci-imx6.c need
> > > > > > this notifier.
> > > > >
> > > > > This is one of those things that's mostly at the mercy of the PCIe root
> > > > > complex implementation. Typically the SMMU StreamID and/or GIC ITS DeviceID
> > > > > is derived directly from the PCI RID, sometimes with additional high-order
> > > > > bits hard-wired to disambiguate PCI segments. I believe this RID-translation
> > > > > LUT is a particular feature of the the Synopsys IP - I know there's also one
> > > > > on the NXP Layerscape platforms, but on those it's programmed by the
> > > > > bootloader, which also generates the appropriate "msi-map" and "iommu-map"
> > > > > properties to match. Ideally that's what i.MX should do as well, but hey.
> > > >
> > > > Maybe this RID-translation is a feature of i.MX, not of Synopsys? I
> > > > see that the LUT CSR accesses use IMX95_* definitions.
> > >
> > > Yes, it convert 16bit RID to 6bit stream id.
> >
> > IIUC, you're saying this is not a Synopsys feature, it's an i.MX
> > feature.
>
> Yes, it is i.MX feature. But I think other vendor should have similar
> situation if use old arm smmu.
>
> >
> > > > > If it's really necessary to do this programming from Linux, then there's
> > > > > still no point in it being dynamic - the mappings cannot ever change, since
> > > > > the rest of the kernel believes that what the DT said at boot time was
> > > > > already a property of the hardware. It would be a lot more logical, and
> > > > > likely simpler, for the driver to just read the relevant map property and
> > > > > program the entire LUT to match, all in one go at controller probe time.
> > > > > Rather like what's already commonly done with the parsing of "dma-ranges" to
> > > > > program address-translation LUTs for inbound windows.
> > > > >
> > > > > Plus that would also give a chance of safely dealing with bad DTs specifying
> > > > > invalid ID mappings (by refusing to probe at all). As it is, returning an
> > > > > error from a child's BUS_NOTIFY_ADD_DEVICE does nothing except prevent any
> > > > > further notifiers from running at that point - the device will still be
> > > > > added, allowed to bind a driver, and able to start sending DMA/MSI traffic
> > > > > without the controller being correctly programmed, which at best won't work
> > > > > and at worst may break the whole system.
> > > >
> > > > Frank, could the imx LUT be programmed once at boot-time instead of at
> > > > device-add time? I'm guessing maybe not because apparently there is a
> > > > risk of running out of LUT entries?
> > >
> > > It is not good idea to depend on boot loader so much.
> >
> > I meant "could this be programmed once when the Linux imx host
> > controller driver is probed?" But from the below, it sounds like
> > that's not possible in general because you don't have enough stream
> > IDs to do that.
>
> Oh! sorry miss understand what your means. It is possible like what I did
> at v3 version. But I think it is not good enough.
>
> >
> > > Some hot plug devics
> > > (SD7.0) may plug after system boot. Two PCIe instances shared one set
> > > of 6bits stream id (total 64). Assume total 16 assign to two PCIe
> > > controllers. each have 8 stream id. If use uboot assign it static, each
> > > PCIe controller have below 8 devices. It will be failrue one controller
> > > connect 7, another connect 9. but if dynamtic alloc when devices add, both
> > > controller can work.
> > >
> > > Although we have not so much devices now, this way give us possility to
> > > improve it in future.
> > >
> > > > It sounds like the consequences of running out of LUT entries are
> > > > catastrophic, e.g., memory corruption from mis-directed DMA? If
> > > > that's possible, I think we need to figure out how to prevent the
> > > > device from being used, not just dev_warn() about it.
> > >
> > > Yes, but so far, we have not met such problem now. We can improve it when
> > > we really face such problem.
> >
> > If this controller can only support DMA from a limited number of
> > endpoints below it, I think we should figure out how to enforce that
> > directly. Maybe we can prevent drivers from enabling bus mastering or
> > something. I'm not happy with the idea of waiting for and debugging a
> > report of data corruption.
>
> It may add a pre-add hook function to pci bridge. let me do more research.

Hi Bjorn:

int pci_setup_device(struct pci_dev *dev)
{
dev->error_state = pci_channel_io_normal;
...
pci_fixup_device(pci_fixup_early, dev);

^^^ I can add fixup hook for pci_fixup_early. If not resource,
I can set dev->error_state to pci_channel_io_frozen or
pci_channel_io_perm_failure

And add below check here after call hook function.

if (dev->error_state != pci_channel_io_normal)
return -EIO;

}

How do you think this method? If you agree, I can continue search device
remove hook up.

Frank

>
> Frank
>
> >
> > Bjorn