Re: [PATCH v3] PCI: keystone: Fix race condition when initializing PHYs
From: Bjorn Helgaas
Date: Tue Jan 09 2024 - 16:23:39 EST

On Wed, Sep 27, 2023 at 09:48:45AM +0530, Siddharth Vadapalli wrote:
> The PCI driver invokes the PHY APIs using the ks_pcie_enable_phy()
> function. The PHY in this case is the Serdes. It is possible that the
> PCI instance is configured for 2 lane operation across two different
> Serdes instances, using 1 lane of each Serdes. In such a configuration,
> if the reference clock for one Serdes is provided by the other Serdes,
> it results in a race condition. After the Serdes providing the reference
> clock is initialized by the PCI driver by invoking its PHY APIs, it is
> not guaranteed that this Serdes remains powered on long enough for the
> PHY APIs based initialization of the dependent Serdes. In such cases,
> the PLL of the dependent Serdes fails to lock due to the absence of the
> reference clock from the former Serdes which has been powered off by the
> PM Core.
>
> Fix this by obtaining reference to the PHYs before invoking the PHY
> initialization APIs and releasing reference after the initialization is
> complete.
>
> Fixes: 49229238ab47 ("PCI: keystone: Cleanup PHY handling")
> Signed-off-by: Siddharth Vadapalli <s-vadapalli@xxxxxx>
> ---
>
> NOTE: This patch is based on linux-next tagged next-20230927.
>
> v2:
> https://lore.kernel.org/r/20230926063638.1005124-1-s-vadapalli@xxxxxx/
>
> Changes since v2:
> - Implement suggestion by Ilpo Järvinen <ilpo.jarvinen@xxxxxxxxxxxxxxx>
> moving the phy_pm_runtime_put_sync() For-Loop section before the
> return value of ks_pcie_enable_phy(ks_pcie) is checked, thereby
> preventing duplication of the For-Loop.
> - Rebase patch on next-20230927.
>
> v1:
> https://lore.kernel.org/r/20230926054200.963803-1-s-vadapalli@xxxxxx/
>
> Changes since v1:
> - Add code to release reference(s) to the phy(s) when
> ks_pcie_enable_phy(ks_pcie) fails.
>
> Regards,
> Siddharth.
>
> drivers/pci/controller/dwc/pci-keystone.c | 9 +++++++++
> 1 file changed, 9 insertions(+)
>
> diff --git a/drivers/pci/controller/dwc/pci-keystone.c b/drivers/pci/controller/dwc/pci-keystone.c
> index 49aea6ce3e87..0ec6720cc2df 100644
> --- a/drivers/pci/controller/dwc/pci-keystone.c
> +++ b/drivers/pci/controller/dwc/pci-keystone.c
> @@ -1218,7 +1218,16 @@ static int __init ks_pcie_probe(struct platform_device *pdev)
>  		goto err_link;
>  	}
>
> +	/* Obtain reference(s) to the phy(s) */
> +	for (i = 0; i < num_lanes; i++)
> +		phy_pm_runtime_get_sync(ks_pcie->phy[i]);
> +
>  	ret = ks_pcie_enable_phy(ks_pcie);
> +
> +	/* Release reference(s) to the phy(s) */
> +	for (i = 0; i < num_lanes; i++)
> +		phy_pm_runtime_put_sync(ks_pcie->phy[i]);

This looks good and has already been applied, so no immediate action
required.

This is the only call to ks_pcie_enable_phy(), and these loops get and
put the PM references for the same PHYs initialized in
ks_pcie_enable_phy(), so it seems like maybe these loops could be
moved *into* ks_pcie_enable_phy().
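
To make the idea concrete, here is a toy userspace model, not kernel
code and not tested against the driver.  Every model_* name is made up
for this example, and the timing window from the commit log is reduced
to a simple ordering problem: a PHY's init fails if its reference
clock provider holds no PM reference at that moment.  The get/put
loops sit inside the enable helper, so a caller cannot forget them:

```c
#define NUM_LANES 2

/* Hypothetical stand-in for the kernel's struct phy. */
struct model_phy {
	int pm_refs;              /* models the runtime-PM usage count */
	struct model_phy *refclk; /* PHY supplying our reference clock, or 0 */
	int pll_locked;           /* set once init succeeds */
};

/* In this model a PHY is powered only while someone holds a reference. */
static int model_powered(const struct model_phy *phy)
{
	return phy->pm_refs > 0;
}

static void model_pm_get(struct model_phy *phy) { phy->pm_refs++; }
static void model_pm_put(struct model_phy *phy) { phy->pm_refs--; }

/*
 * Models one PHY's init: it holds its own reference only for the
 * duration of the call, and the PLL locks only if the reference
 * clock provider (if any) is powered.
 */
static int model_phy_init(struct model_phy *phy)
{
	int ret = 0;

	model_pm_get(phy);
	if (phy->refclk && !model_powered(phy->refclk))
		ret = -1;	/* PLL fails to lock: no reference clock */
	else
		phy->pll_locked = 1;
	model_pm_put(phy);

	return ret;
}

/*
 * Enable helper with the get/put loops folded in.  hold_refs == 0
 * models the behavior before the fix, for comparison only.
 */
static int model_enable_phys(struct model_phy **phy, int num_lanes,
			     int hold_refs)
{
	int i, ret = 0;

	if (hold_refs)
		for (i = 0; i < num_lanes; i++)
			model_pm_get(phy[i]);

	for (i = 0; i < num_lanes; i++) {
		ret = model_phy_init(phy[i]);
		if (ret)
			break;
	}

	/* Release on success and failure alike, as in the v3 patch. */
	if (hold_refs)
		for (i = 0; i < num_lanes; i++)
			model_pm_put(phy[i]);

	return ret;
}

/* Two lanes on two Serdes; lane 0's Serdes takes its clock from lane 1's. */
static int model_two_lane_demo(int hold_refs)
{
	struct model_phy serdes0 = {0}, serdes1 = {0};
	struct model_phy *lanes[NUM_LANES] = { &serdes0, &serdes1 };

	serdes0.refclk = &serdes1;
	return model_enable_phys(lanes, NUM_LANES, hold_refs);
}
```

In this model, model_two_lane_demo(0) fails on lane 0 because nothing
keeps the clock provider powered, while model_two_lane_demo(1)
succeeds.  The real helper would of course use the existing
phy_pm_runtime_get_sync()/phy_pm_runtime_put_sync() calls from the
patch.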

Is there any similar issue in ks_pcie_disable_phy()?  What if we
power-off a PHY that provides a reference clock to other PHYs that are
still powered-up?  Will the dependent PHYs still power-off cleanly?

>  	if (ret) {
>  		dev_err(dev, "failed to enable phy\n");
>  		goto err_link;
> --
> 2.34.1