Re: [PATCHv9 00/12] PCI: Recode Mobiveil driver and add PCIe Gen4 driver for NXP Layerscape SoCs

From: Olof Johansson
Date: Tue Feb 11 2020 - 09:48:45 EST


On Tue, Feb 11, 2020 at 5:04 AM Robin Murphy <robin.murphy@xxxxxxx> wrote:
>
> On 2020-02-11 12:13 pm, Laurentiu Tudor wrote:
> [...]
> >> This is a known issue about DPAA2 MC bus not working well with SMMU
> >> based IO mapping. Adding Laurentiu to the chain who has been looking
> >> into this issue.
> >
> > Yes, I'm closely following the issue. I actually have a workaround
> > (attached) but haven't submitted as it will probably raise a lot of
> > eyebrows. In the mean time I'm following some discussions [1][2][3] on
> > the iommu list which seem to try to tackle what appears to be a similar
> > issue but with framebuffers. My hope is that we will be able to leverage
> > whatever turns out.
>
> Indeed it's more general than framebuffers - in fact there was a
> specific requirement from the IORT side to accommodate network/storage
> controllers with in-memory firmware/configuration data/whatever set up
> by the bootloader that want to be handed off 'live' to Linux because the
> overhead of stopping and restarting them is impractical. Thus this DPAA2
> setup is very much within scope of the desired solution, so please feel
> free to join in (particularly on the DT parts) :)

That's a real problem that needs a solution, but that's not what's
happening here, since cold boots work fine.

Isn't it a whole lot more likely that something isn't
reset/reinitialized properly in u-boot, such that there is lingering
state in the setup, causing this?

> As for right now, note that your patch would only be a partial
> mitigation to slightly reduce the fault window but not remove it
> entirely. To be robust the SMMU driver *has* to know about live streams
> before the first arm_smmu_reset() - hence the need for generic firmware
> bindings - so doing anything from the MC driver is already too late (and
> indeed the current iommu_request_dm_for_dev() mechanism is itself a
> microcosm of the same problem).

This is more likely a live stream that's left behind from the previous
kernel (there are some error messages about being unable to detach
domains, but the errors make it hard to tell what driver didn't unbind
enough).

*BUT*, even with that bug, the system should reboot reliably and come
up clean. So, something isn't clearing up the state *on boot*.

> > In the mean time, can you try the workaround Leo suggested?
>
> Agreed, I'd imagine the command-line option is probably the best choice
> for these platforms, since it's likely to be easier to set that by
> default in the bootloader than faff with rebuilding generic kernel configs.

For the generic user, definitely. I'll give it a go later this week
when I have a bit more spare time with the device physically present.


-Olof