Re: [PATCH pci] PCI: don't skip probing entire device if first fn OF node has status = "disabled"

From: Vladimir Oltean
Date: Mon May 29 2023 - 16:48:21 EST


Hi Bjorn,

On Sun, May 21, 2023 at 02:51:41PM +0300, Vladimir Oltean wrote:
> pci_scan_child_bus_extend() calls pci_scan_slot() with devfn
> (bus:device:function) being a multiple of 8, i.e. for each unique
> device.
>
> pci_scan_slot() has logic to say that if the function 0 of a device is
> absent, the entire device is absent and we can skip the other functions
> entirely. Traditionally, this has meant that pci_bus_read_dev_vendor_id()
> returns an error code for that function.
>
> However, since the blamed commit, there is an extra confounding
> condition: function 0 of the device exists and has a valid vendor id,
> but it is disabled in the device tree. In that case, pci_scan_slot()
> would incorrectly skip the entire device instead of just that function.
>
> Such is the case with the NXP LS1028A SoC, which has an ECAM
> for embedded Ethernet (see pcie@1f0000000 in
> arm64/boot/dts/freescale/fsl-ls1028a.dtsi). Each Ethernet port
> represents a function within the ENETC ECAM, with function 0 going
> to ENETC Ethernet port 0, connected to SERDES port 0 (SGMII or USXGMII).
>
> When using a SERDES protocol such as 0x9999, all 4 SERDES lanes go to
> the Ethernet switch (function 5 on this ECAM) and none go to ENETC
> port 0. So, ENETC port 0 needs to have status = "disabled", and embedded
> Ethernet takes place just through the other functions (fn 2 is the DSA
> master, fn 3 is the MDIO controller, fn 5 is the DSA switch etc).
> Contrast this with other SERDES protocols like 0x85bb, where the switch
> takes up a single SERDES lane and uses the QSGMII protocol - so ENETC
> port 0 also gets access to a SERDES lane.
>
> Therefore, here, function 0 being unused has nothing to do with the
> entire PCI device being unused.
>
> Add a "bool present_but_skipped" which is propagated from the caller
> of pci_set_of_node() all the way to pci_scan_slot(), so that it can
> distinguish an error reading the ECAM from a disabled device in the
> device tree.
>
> Fixes: 6fffbc7ae137 ("PCI: Honor firmware's device disabled status")
> Signed-off-by: Vladimir Oltean <vladimir.oltean@xxxxxxx>
> ---

Do you have some comments here?

Thanks,
Vladimir