Re: [PATCH 0/4] PCI SMC conduit, now with DT support
From: Pali Rohár
Date: Thu Aug 18 2022 - 17:55:16 EST
On Tuesday 16 August 2022 08:59:05 Catalin Marinas wrote:
> Hi Jeremy,
>
> On Thu, Jul 28, 2022 at 12:20:55PM -0500, Jeremy Linton wrote:
> > On 7/26/22 06:40, Will Deacon wrote:
> > > On Mon, Jul 25, 2022 at 11:39:01AM -0500, Jeremy Linton wrote:
> > > > This is a rebase of the later revisions of [1], but refactored
> > > > slightly to add a DT method as well. It has all the same advantages of
> > > > the ACPI method (putting HW quirks in the firmware rather than the
> > > > kernel) but now applied to a 'pci-host-smc-generic' compatible
> > > > property which extends the pci-host-generic logic to handle cases
> > > > where the PCI Config region isn't ECAM compliant. With this in place,
> > > > and firmware managed clock/phy/etc it's possible to run the generic
> > > > driver on hardware that isn't what one would consider standards
> > > > compliant PCI root ports.
> > >
> > > I still think that hiding the code in firmware because the hardware is
> > > broken is absolutely the wrong way to tackle this problem and I thought
> > > the general idea from last time was that we were going to teach Linux
> > > about the broken hardware instead [1]. I'd rather have the junk where we
> > > can see it, reason about it and modify it.
> [...]
> > Is it the official position of the Linux kernel maintainers that they will
> > refuse to support future Arm standards in order to gate keep specific
> > hardware platforms?
>
> (just back from holiday; well, briefly, going away for a few days soon)
>
> We shouldn't generalise what maintainers would accept or not. We decide
> on a case by case basis. With speculative execution mitigations, for
> example, we try to do as much as we can in the kernel but sometimes
> that's just not possible, hence an EL3 call and we'd rather have this
> standardised (e.g. custom branch loops to flush the branch predictor if
> possible from the normal world, secure call if not).
>
> You mention PSCI but that's not working around broken hardware, it was a
> conscious decision from the start to standardise the booting protocol and
> CPU power management.
>
> Now this PCI SMC protocol was simply created because hardware did not
> comply with another PCI standard that has been around for a long time.
> As with the speculative execution mitigations, we'd rather work around
> broken hardware in the kernel first and, if it's not possible, we can
> look at a firmware interface (and ideally standardised). Do you have an
> example where we cannot work around the PCI hardware bugs in the kernel
> and EL3 firmware involvement is necessary?
>
> So, in summary, Arm Ltd proposing a new standard because hardware
> companies can't be bothered with an existing one is not an argument for
> accepting its support in the Linux kernel. This PCI SMC conduit is not
> presented as a hardware bug workaround interface but rather as an
> alternative to ECAM (and, yes, the kernel maintainers can choose not to
> support specific "standards" in Linux).
Hello! I think that this PCI SMC could already be marked as deprecated,
as Linux can use "native" drivers to access PCIe config space without
needing any kind of RPC mechanism such as ARM SMC.

Note that, for example, the kernel driver phy-mvebu-a3700-comphy.c was
converted from the ARM SMC API to a true "native" Linux driver which
touches the hardware directly (and does not use an RPC API). And this is
the right direction: stop using RPC APIs in the kernel and configure
hardware directly, without depending on firmware, SMC or any other SW
which is running on the CPU. Depending on firmware, or on functionality
in it which accesses the same HW as the kernel itself, is always a
nightmare. x86 developers have enough experience with BIOS and its poor
implementations, and for a long time the direction there has been to not
use the x86 BIOS and rather communicate with hardware directly.

And if PCIe hardware is broken? Well, PCIe controller drivers should be
extended to handle or work around it. I have already sent a lot of
patches for Marvell PCIe controllers to work around HW design issues, so
the same should be done for other (known broken) vendor HW.

So in my opinion, instead of adding PCI SMC, kernel PCIe controller
drivers should be fixed to correctly access PCIe config space, and this
PCI SMC should be completely deprecated/removed from the kernel. And if
PCI SMC has not landed in the kernel yet, even better, because the
deprecation step can be skipped.