Re: [PATCH v2] PCI: dwc: Wait for link up only if link is started

From: Ajay Agarwal
Date: Wed Apr 05 2023 - 23:50:42 EST


On Wed, Apr 05, 2023 at 12:23:47PM -0700, William McVicker wrote:
> On 04/05/2023, William McVicker wrote:
> > On 04/05/2023, Bjorn Helgaas wrote:
> > > On Wed, Apr 05, 2023 at 03:24:36PM +0200, Lorenzo Pieralisi wrote:
> > > > On Thu, Mar 16, 2023 at 06:05:02PM -0500, Sajid Dalvi wrote:
> > > > > On Tue, Feb 28, 2023 at 10:36 PM Sajid Dalvi <sdalvi@xxxxxxxxxx> wrote:
> > > > > >
> > > > > > Thanks for your review Jingoo.
> > > > > > Sajid
> > > > > >
> > > > > > On Tue, Feb 28, 2023 at 4:04 PM Han Jingoo <jingoohan1@xxxxxxxxx> wrote:
> > > > > > >
> > > > > > > On Mon, Feb 27, 2023, Sajid Dalvi <sdalvi@xxxxxxxxxx> wrote:
> > > > > > > >
> > > > > > > > In dw_pcie_host_init() regardless of whether the link has been started
> > > > > > > > or not, the code waits for the link to come up. Even in cases where
> > > > > > > > start_link() is not defined the code ends up spinning in a loop for 1
> > > > > > > > second. Since in some systems dw_pcie_host_init() gets called during
> > > > > > > > probe, this one second loop for each pcie interface instance ends up
> > > > > > > > extending the boot time.
> > > > > > > >
> > > > > > > > Call trace when start_link() is not defined:
> > > > > > > > dw_pcie_wait_for_link << spins in a loop for 1 second
> > > > > > > > dw_pcie_host_init
> > > > > > > >
> > > > > > > > Signed-off-by: Sajid Dalvi <sdalvi@xxxxxxxxxx>
> > > > > > >
> > > > > > > (CC'ed Krzysztof Kozlowski)
> > > > > > >
> > > > > > > Acked-by: Jingoo Han <jingoohan1@xxxxxxxxx>
> > > > > > >
> > > > > > > It looks good to me. I also checked the previous thread.
> > > > > > > I agree with Krzysztof's opinion that we should include
> > > > > > > only hardware-related features into DT.
> > > > > > > Thank you.
> > > > > > >
> > > > > > > Best regards,
> > > > > > > Jingoo Han
> > > > > > >
> > > > > > > > ---
> > > > > > > > drivers/pci/controller/dwc/pcie-designware-host.c | 6 +++---
> > > > > > > > 1 file changed, 3 insertions(+), 3 deletions(-)
> > > > > > > >
> > > > > > > > diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c
> > > > > > > > index 9952057c8819..9709f69f173e 100644
> > > > > > > > --- a/drivers/pci/controller/dwc/pcie-designware-host.c
> > > > > > > > +++ b/drivers/pci/controller/dwc/pcie-designware-host.c
> > > > > > > > @@ -489,10 +489,10 @@ int dw_pcie_host_init(struct dw_pcie_rp *pp)
> > > > > > > > ret = dw_pcie_start_link(pci);
> > > > > > > > if (ret)
> > > > > > > > goto err_remove_edma;
> > > > > > > > - }
> > > > > > > >
> > > > > > > > - /* Ignore errors, the link may come up later */
> > > > > > > > - dw_pcie_wait_for_link(pci);
> > > > > > > > + /* Ignore errors, the link may come up later */
> > > > > > > > + dw_pcie_wait_for_link(pci);
> > > > > > > > + }
> > > > > > > >
> > > > > > > > bridge->sysdata = pp;
> > > > > > > >
> > > > > > > > --
> > > > > > > > 2.39.2.722.g9855ee24e9-goog
> > > > > > > >
> > > > >
> > > > > @bhelgaas Can this be picked up in your tree:
> > > > > https://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git/
> > > >
> > > > This patch seems fine to me. The question I have though is why the
> > > > *current* code is written the way it is. Perhaps it is just the way
> > > > it is, I wonder whether this change can trigger a regression though.
> > >
> > > The new code will look basically like this:
> > >
> > > if (!dw_pcie_link_up(pci)) {
> > > dw_pcie_start_link(pci);
> > > dw_pcie_wait_for_link(pci);
> > > }
> > >
> > > If the link is already up by the time we get here, this change means
> > > we won't get this message emitted by dw_pcie_wait_for_link():
> > >
> > > dev_info(pci->dev, "PCIe Gen.%u x%u link up\n", ...)
> > >
> > > I don't know how important that is, but I bet somebody cares about it.
> > >
> > > From the commit log, I expected the patch to do something based on
> > > whether ->start_link() was defined, but there really isn't a direct
> > > connection, so maybe the log could be refined.
> > >
> > > Bjorn
> > >
> >
> > After taking a deeper dive into this patch, I found that [1] changes the
> > original intent which was to skip the call to dw_pcie_wait_for_link()
> > when pci->ops->start_link is NULL. I talked to Sajid offline and he
> > agreed we should put back the start_link NULL check. The updated patch
> > should look like this:
> >
> > if (!dw_pcie_link_up(pci) && pci->ops && pci->ops->start_link) {
> > ret = dw_pcie_start_link(pci);
> > if (ret)
> > goto err_free_msi;
> > dw_pcie_wait_for_link(pci);
> > }
> >
> >
> > ...which will ensure that we don't call dw_pcie_wait_for_link() when
> > pci->ops->start_link is NULL.
> >
> > With regards to the log, I think there are 2 ways to solve this:
> >
> > 1) We could also call dw_pcie_wait_for_link() in a new else if
> > dw_pcie_link_up() returns 1.
> > 2) We could add this to the top of dw_pcie_wait_for_link() and leave the
> > code as is:
> >
> > if (!pci->ops || !pci->ops->start_link)
> > return 0;
> >
> > I kind of like (2) since that solves both Sajid's original issue and
> > will keep the original log.
> >
> > [1] https://lore.kernel.org/all/20220624143428.8334-14-Sergey.Semin@xxxxxxxxxxxxxxxxxxxx/
> >
> > Regards,
> > Will
>
> Below is what I'm thinking will do the job. I verified on a Pixel 6
> (which doesn't have start_link() defined) that we don't have the 1
> second wait from dw_pcie_wait_for_link() during probe.
>
> diff --git a/drivers/pci/controller/dwc/pcie-designware.c b/drivers/pci/controller/dwc/pcie-designware.c
> index 8e33e6e59e68..1bf04324ad2d 100644
> --- a/drivers/pci/controller/dwc/pcie-designware.c
> +++ b/drivers/pci/controller/dwc/pcie-designware.c
> @@ -648,13 +648,16 @@ int dw_pcie_wait_for_link(struct dw_pcie *pci)
> {
> u32 offset, val;
> int retries;
> + int link_up = dw_pcie_link_up(pci);
>
> - /* Check if the link is up or not */
> - for (retries = 0; retries < LINK_WAIT_MAX_RETRIES; retries++) {
> - if (dw_pcie_link_up(pci))
> - break;
> + if (!link_up && !(pci->ops && pci->ops->start_link))
> + return 0;

There is a problem with this approach. A platform driver could enable
link training internally, without defining the start_link() callback,
and then call dw_pcie_wait_for_link() to wait for the link to come up
(see pcie-intel-gw.c for an example of such a platform). With this
check, dw_pcie_wait_for_link() returns early whenever the link is not
already up, which regresses that driver.
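
To make the concern concrete, here is a rough userspace sketch, not
kernel code. The mock_* names, the polls_until_up counter and
probe_sees_link_up() are all made up for illustration, and the sleep
between polls is left out. It models a driver with no start_link op
whose link comes up a few polls after training is enabled internally,
and compares the current wait loop with the proposed early return.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define LINK_WAIT_MAX_RETRIES 10

/* Hypothetical stand-ins for struct dw_pcie and its ops; not the real ones. */
struct mock_pcie;

struct mock_pcie_ops {
	int (*start_link)(struct mock_pcie *pci);
};

struct mock_pcie {
	const struct mock_pcie_ops *ops;
	int polls_until_up; /* "down" polls left before link is up; <0 = never */
};

/* Stand-in for dw_pcie_link_up(): consumes one "down" poll per call. */
static bool mock_link_up(struct mock_pcie *pci)
{
	assert(pci != NULL);
	if (pci->polls_until_up < 0)
		return false;
	if (pci->polls_until_up == 0)
		return true;
	pci->polls_until_up--;
	return false;
}

/* Shape of the current loop: always poll, whether or not start_link exists. */
static int wait_for_link_current(struct mock_pcie *pci)
{
	int retries;

	for (retries = 0; retries < LINK_WAIT_MAX_RETRIES; retries++) {
		if (mock_link_up(pci))
			return 0;
		/* usleep_range() between polls elided in this sketch */
	}
	return -ETIMEDOUT;
}

/* Shape of the proposed check: bail out if link is down and no start_link. */
static int wait_for_link_proposed(struct mock_pcie *pci)
{
	int retries;
	bool link_up = mock_link_up(pci);

	if (!link_up && !(pci->ops && pci->ops->start_link))
		return 0; /* early exit: the link state is never polled again */

	for (retries = 0; !link_up && retries < LINK_WAIT_MAX_RETRIES; retries++)
		link_up = mock_link_up(pci);

	/* Simplification: test link_up directly instead of the retry count. */
	return link_up ? 0 : -ETIMEDOUT;
}

/*
 * Models a driver that enables link training itself (ops == NULL) and then
 * relies on the wait helper. Returns true only if the helper reported
 * success and the link had really come up by then.
 */
static bool probe_sees_link_up(int (*wait)(struct mock_pcie *),
			       int polls_until_up)
{
	struct mock_pcie pci = { .ops = NULL, .polls_until_up = polls_until_up };

	if (wait(&pci) != 0)
		return false;
	return pci.polls_until_up == 0;
}
```

In this model, probe_sees_link_up(wait_for_link_current, 3) is true,
while probe_sees_link_up(wait_for_link_proposed, 3) is false: the
proposed helper reports success after a single poll, so the caller
continues with the link still down.
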
>
> + /* Check if the link is up or not */
> + for (retries = 0; !link_up && retries < LINK_WAIT_MAX_RETRIES; retries++) {
> usleep_range(LINK_WAIT_USLEEP_MIN, LINK_WAIT_USLEEP_MAX);
> +
> + link_up = dw_pcie_link_up(pci);
> }
>
> if (retries >= LINK_WAIT_MAX_RETRIES) {
>
The log problem also remains unsolved for a platform whose link comes
up by default, i.e., one that does not need to enable link training
explicitly.