Re: [REGRESSION] Re: imx8 PCI regression since "iommu: Get DT/ACPI parsing into the proper probe path"
From: Robin Murphy
Date: Fri Jan 16 2026 - 12:35:38 EST
On 2026-01-16 5:10 pm, Jason Gunthorpe wrote:
On Fri, Jan 16, 2026 at 05:52:36PM +0100, Nicolas Cavallari wrote:
I debugged it further, it seems to be mostly a PCI issue since the system
does not actually have an IOMMU.
When examining changes in the PCI configuration (lspci -vvvv), the main
difference is that, with the patch, Access Control Services are enabled on
the PCI switch.
  Capabilities: [220 v1] Access Control Services
          ACSCap: SrcValid+ TransBlk+ ReqRedir+ CmpltRedir+
                  UpstreamFwd+ EgressCtrl+ DirectTrans+
-         ACSCtl: SrcValid- TransBlk- ReqRedir- CmpltRedir-
                  UpstreamFwd- EgressCtrl- DirectTrans-
+         ACSCtl: SrcValid+ TransBlk- ReqRedir+ CmpltRedir+
                  UpstreamFwd+ EgressCtrl- DirectTrans-
If I manually patch the config space in sysfs and re-disable ACS on the port
connected to the LAN7430, I cannot reproduce the problem. In fact,
disabling only ReqRedir is enough to work around the issue.
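
For reference, the ACSCtl flags in the lspci diff above are bits of the 16-bit ACS Control register, which sits at offset 0x06 inside the ACS capability (so 0x226 on this port, given the capability at 0x220). Below is a minimal Python sketch for converting between the lspci flag string and the raw register value. It assumes the bit positions of the PCI_ACS_* definitions in include/uapi/linux/pci_regs.h; the helper names are made up for illustration and are not part of any tool.

```python
# Bit masks of the ACS Control register, keyed by the names lspci prints.
# Values are assumed to match PCI_ACS_SV .. PCI_ACS_DT in pci_regs.h.
ACS_CTRL_BITS = {
    "SrcValid": 0x0001,
    "TransBlk": 0x0002,
    "ReqRedir": 0x0004,
    "CmpltRedir": 0x0008,
    "UpstreamFwd": 0x0010,
    "EgressCtrl": 0x0020,
    "DirectTrans": 0x0040,
}

ACS_CAP_OFFSET = 0x220   # from "Capabilities: [220 v1]" in the lspci output
PCI_ACS_CTRL = 0x06      # control register offset within the capability


def encode_acs_ctl(flags: str) -> int:
    """Turn an lspci-style string ("SrcValid+ TransBlk- ...") into a value."""
    value = 0
    for token in flags.split():
        name, state = token[:-1], token[-1]
        if state == "+":
            value |= ACS_CTRL_BITS[name]
    return value


def decode_acs_ctl(value: int) -> str:
    """Turn a register value back into the lspci-style flag string."""
    return " ".join(
        f"{name}{'+' if value & bit else '-'}"
        for name, bit in ACS_CTRL_BITS.items()
    )


# ACSCtl as reported with the patch applied.
with_patch = encode_acs_ctl(
    "SrcValid+ TransBlk- ReqRedir+ CmpltRedir+ "
    "UpstreamFwd+ EgressCtrl- DirectTrans-"
)
# The workaround described above: clear only ReqRedir.
workaround = with_patch & ~ACS_CTRL_BITS["ReqRedir"]

print(f"register at {ACS_CAP_OFFSET + PCI_ACS_CTRL:#x}")    # 0x226
print(f"with patch: {with_patch:#06x}")                      # 0x001d
print(f"workaround: {workaround:#06x}")                      # 0x0019
print(decode_acs_ctl(workaround))
```

So the workaround amounts to writing 0x0019 instead of 0x001d to the control register of that downstream port.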
My guess would be your system has some kind of address alias going on?
Assuming you are not facing an erratum, ACS generally changes the
routing of TLPs so if you have a DMA address that could go to two
different places then messing with ACS will give you different
behaviors.
Specifically, when you turn on all those ACS settings you cannot do P2P
traffic anymore. If your system expects this for some reason then you
must use the kernel command line option to disable ACS.
If you are just doing normal netdev stuff then it is doubtful that you
are doing P2P at all, so I might guess a bug in the microchip ethernet
driver doing a wild DMA? Stricter ACS settings cause it to AER and the
device cannot recover?
It will be hard to get to the bottom of the defect without a PCI trace.
I don't know why your bisection landed on bcb8 - the intention was
that pci_enable_acs() is always called, and I didn't notice an obvious
reason why that wouldn't happen prior to bcb8. It is called directly
from pci_device_add(). Maybe investigating that angle would be
informative.
The difference is that bcb8 moves the pci_request_acs() call on OF systems
back early enough to actually have an effect - that's spent the last 6
years being pretty much a no-op since 6bf6c24720d3 ("iommu/of: Request ACS
from the PCI core when configuring IOMMU linkage")...
Thanks,
Robin.
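
To illustrate the ordering point, here is a toy Python model, not kernel code. It assumes only what the thread states: pci_request_acs() records a global request, and pci_enable_acs() acts on that request at the moment a device is added via pci_device_add(). The class and method names are hypothetical stand-ins.

```python
class ToyPciCore:
    """Toy stand-in for the PCI core's ACS request/enable ordering."""

    def __init__(self):
        self.acs_requested = False   # set by request_acs()
        self.acs_enabled = {}        # device name -> ACS enabled at add time

    def request_acs(self):
        # Stand-in for pci_request_acs(): only records the request.
        self.acs_requested = True

    def device_add(self, name):
        # Stand-in for pci_device_add() -> pci_enable_acs(): the request
        # flag is sampled once, when the device is enumerated.
        self.acs_enabled[name] = self.acs_requested


# Request arrives after enumeration: effectively a no-op for that device.
late = ToyPciCore()
late.device_add("switch-port")
late.request_acs()

# Request arrives before enumeration: ACS gets enabled on the device.
early = ToyPciCore()
early.request_acs()
early.device_add("switch-port")

print("late request :", late.acs_enabled["switch-port"])
print("early request:", early.acs_enabled["switch-port"])
```

In this model the late request leaves the port without ACS and the early request enables it, which matches ACS only showing up on the switch once the request is made early enough.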
I also read up on AER and I'm surprised that I don't see anything in dmesg
when the problem occurs, even though UERcvd+ starts appearing on the root
context and AdvNonFatalErr+ appears on the switch.
Though UE and AdvNonFatalErr sure are weird indications for an
addressing error. Is there some kind of special embedded system thing
going on? Vendor messages over PCI perhaps?
Jason