Re: [PATCH pci] PCI: don't skip probing entire device if first fn OF node has status = "disabled"
From: Vladimir Oltean
Date: Thu Jun 01 2023 - 18:16:05 EST
On Thu, Jun 01, 2023 at 12:51:39PM -0500, Bjorn Helgaas wrote:
> > > Doing it in Linux would minimize dependences on the bootloader, so
> > > that seems desirable to me. That means Linux needs to enumerate
> > > Function 0 so it is visible to a driver or possibly a quirk.
> >
> > Uhm... no, that wouldn't be enough. Only a straight revert would satisfy
> > the workaround that we currently have for NXP ENETC in Linux.
>
> I guess you mean a revert of 6fffbc7ae137?
Yes.
> This whole conversation is about whether we can rework 6fffbc7ae137 to
> work both for Loongson and for you, so nothing is decided yet.
After reading
https://lore.kernel.org/linux-pci/20221117020935.32086-1-liupeibao@xxxxxxxxxxx/
and
https://lore.kernel.org/linux-pci/20221103090040.836-1-liupeibao@xxxxxxxxxxx/
and seeing the GMAC OF node at arch/mips/boot/dts/loongson/loongson64-2k1000.dtsi,
I believe that a solution that would work for both Loongson and NXP would be to:
- patch loongson_dwmac_probe() to check for of_device_is_available()
- revert commit 6fffbc7ae137 ("PCI: Honor firmware's device disabled
status")
I'm not sure what else of what was concretely proposed would work.
Anything else is just wishful thinking that the PCI core can start
enforcing a central policy, after letting device drivers get to choose
how (and whether) to treat the "status" OF property for years on end.
As an added benefit, the disabled GMAC would become visible in lspci for
the Loongson SoC.
> The point is, I assume you agree that it's preferable if we don't have
> to depend on a bootloader to clear the memory.
I am confused by the message you are transmitting here.
With my user hat on, yes, maintaining the effect of commit 3222b5b613db
from Linux is preferable.
Although Rob will probably not be happy about the way in which that will
be achieved. And you haven't proposed ways in which that would remain
possible, short of a revert of commit 6fffbc7ae137.
> After 6fffbc7ae137, the probe function is not called if the device is
> disabled in DT because there's no pci_dev for it at all.
Correct, but commit 3222b5b613db pre-dates it by 2 years, and thus, it
is broken by Rob's change.
> > My problem is that I don't really understand what was the functional
> > need for commit 6fffbc7ae137 ("PCI: Honor firmware's device disabled
> > status") in the first place, considering that any device driver can
> > already fail to probe based on the same condition at its own will.
>
> In general, PCI drivers shouldn't rely on DT. If the bus driver (PCI
> in this case) calls a driver's probe function, the driver can assume
> the device exists.
Well, the device exists...
> But enetc is not a general-purpose driver, and if DT is the only way
> to discover this property, I guess you're stuck doing that.
So what Loongson tried to do - break enumeration of the on-chip GMAC
PCIe device at the level of the PCIe controller, if the GMAC's pinmuxing
doesn't make it available for networking - is encouraged?
Do you consider that their patch would have been better in the original
form, if instead of the "skip-scan" property, they would have built some
smarts into drivers/pci/controller/pci-loongson.c which would intentionally
break config space access to gmac@3,0, without requiring OF to specify this?
Are you saying that this "present but unusable due to pinmuxing" is an
incorrect use of status = "disabled"? What would it constitute correct
use of, then?
The analogous situation for ENETC would be to patch the "pci-host-ecam-generic"
driver to read the SERDES and pinmuxing configuration of the SoC, and to
mask/unmask the config access to function 0 based on that. I mean - I could...
but is it really a good idea? The principle of separation of concerns
tells me no. The fact that the pinmuxing of the device makes it unavailable
pertains to the IP-specific logic, it doesn't change whether it's enumerable
or accessible on its bus.
> > > Is DT the only way to learn the NXP SERDES configuration? I think it
> > > would be much better if there were a way to programmatically learn it,
> > > because then you wouldn't have to worry about syncing the DT with the
> > > platform configuration, and it would decouple this from the Loongson
> > > situation.
> >
> > Syncing the DT with the platform configuration will always be necessary,
> > because for networking we will also need extra information which is
> > completely non-discoverable, like a phy-handle or such, and that depends
> > on the wiring and static pinmuxing of the SoC. So it is practically
> > reasonable to expect that what is usable has status = "okay", and what
> > isn't has status = "disabled". Not to mention, there are already device
> > trees in circulation which are written that way, and those need to
> > continue to work.
>
> Just because we need DT for non-discoverable info A doesn't mean we
> should depend on it for B if B *is* discoverable.
But the argument was: we already have device trees with a certain
convention, and that is to expect having status = "disabled" for
unusable ports. I don't believe that changing that is realistically in
scope for fixing this. And if we have device trees with status =
"disabled" in circulation which we (I) don't want to break, then we're
back to square 1 regarding the probing of disabled devices.
> This question of disabling a device via DT but still needing to do
> things to the device is ... kind of a sticky wicket.
It boils down to whether accessing a disabled device is permitted or
not. I opened the devicetree specification and it didn't say anything
conclusive. Though it's certainly above my pay grade to say anything
with certainty in this area. Apart from "okay" and "disabled", "status"
takes other documented values too, like "reserved", "fail" and
"fail-sss". Linux treats everything that's not "okay" the same.
Krzysztof Kozlowski came with the suggestion for Loongson to replace
"skip-scan" with "status", during the review of their v1 patch.
In any case, that question will only recur one level lower - in U-Boot,
where we make an effort to keep device trees in sync in Linux. Why would
U-Boot need to do things to a disabled device? :)
> Maybe this should be a different DT property (not "status"). Then PCI
> enumeration could work normally and 6fffbc7ae137 wouldn't be in the
> way.
I'm not quite sure where you're going with this. More concretely?