Re: [PATCH pci] PCI: don't skip probing entire device if first fn OF node has status = "disabled"

From: Bjorn Helgaas
Date: Tue May 30 2023 - 18:27:31 EST

On Wed, May 31, 2023 at 01:04:36AM +0300, Vladimir Oltean wrote:
> On Tue, May 30, 2023 at 04:58:55PM -0500, Bjorn Helgaas wrote:
> > Can you write this description in terms of PCI topology? The
> > nitty-gritty SERDES details are not relevant at this level, except to
> > say that Function 0 is present in some cases but not others, and when
> > it is not present, *other* functions may be present.
>
> No. It is to say that within the device, all PCIe functions (including 0)
> are always available and have the same number, but depending on SERDES
> configuration, their PCIe presence might be practically useful or not.
> So that's how function 0 may end up having status = "disabled" in the
> device tree.
>
> > Sigh. Per spec (PCIe r6.0), software is not permitted
> > to probe for Functions other than 0 unless "explicitly indicated by
> > another mechanism, such as an ARI or SR-IOV Capability."
> >
> > Does it "work" to probe when the spec prohibits it? Probably. Does
> > it lead to some breakage elsewhere eventually? Quite possibly. They
> > didn't put "software must not probe" in the spec just to make
> > enumeration faster.
> >
> > So I'm a little grumpy about further complicating this already messy
> > path just to accommodate a new non-compliant SoC. Everybody pays the
> > price of understanding all this stuff, and it doesn't seem in balance.
> >
> > Can you take advantage of some existing mechanism like
> > PCI_SCAN_ALL_PCIE_DEVS or hypervisor_isolated_pci_functions() (which
> > could be renamed and made more general)?
>
> Not responding yet to the rest of the email since it's not clear to me
> that you've understood function 0 is absolutely present and responds
> to all config space accesses - it's just disabled in the device tree
> because the user doesn't have something useful to do with it.

Ah, you're right, sorry I missed that. Dispensing with the SERDES
details would make this more obvious.

Not sure why this needs to change the pci_scan_slot() path, since
Function 0 is present and enumerable even though it's not useful in
some cases. Seems like something in pci_set_of_node() or a quirk
could do whatever you need to do.
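
To make that concrete, here is a rough userspace model of the idea, not
kernel code. struct fn_model and scan_slot() are hypothetical names made
up for illustration, and the real pci_scan_slot() and pci_set_of_node()
look nothing like this. The only point is that enumeration stays
untouched (Function 0 is probed first, the others only if it reports
multi-function), and the status = "disabled" decision is made per
function afterwards:

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_FN 8

/* Hypothetical model of one PCI function; NOT a kernel structure. */
struct fn_model {
	bool responds;      /* answers config space accesses */
	bool multifunction; /* multi-function bit, only meaningful on fn 0 */
	bool of_disabled;   /* OF node has status = "disabled" */
};

/*
 * Enumerate a slot the ordinary way and, as a separate quirk-like step,
 * mark functions whose OF node is disabled as not usable.
 * Returns the number of functions enumerated; usable[] is filled in.
 */
static int scan_slot(const struct fn_model fns[MAX_FN], bool usable[MAX_FN])
{
	int found = 0;
	int last;

	for (int fn = 0; fn < MAX_FN; fn++)
		usable[fn] = false;

	/* Function 0 does not respond: do not probe the other functions. */
	if (!fns[0].responds)
		return 0;

	/* Only look past Function 0 if it reports a multi-function device. */
	last = fns[0].multifunction ? MAX_FN : 1;

	for (int fn = 0; fn < last; fn++) {
		if (!fns[fn].responds)
			continue;
		found++;
		/* Enumerated regardless; the DT status only gates usability. */
		usable[fn] = !fns[fn].of_disabled;
	}

	return found;
}
```

With Function 0 responding, multi-function, and disabled in the DT, and
Function 1 enabled, this model enumerates both and marks only Function 1
usable, with no change to the scan order itself.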