Re: [PATCH 4/5] PCI: wait device ready after pci_pm_reset()

From: Sinan Kaya
Date: Thu Oct 12 2017 - 12:49:05 EST


On 10/11/2017 6:06 PM, Bjorn Helgaas wrote:
>> static int pci_pm_reset(struct pci_dev *dev, int probe)
>> {
>> + unsigned int delay = dev->d3_delay;
>> u16 csr;
>>
>> if (!dev->pm_cap || dev->dev_flags & PCI_DEV_FLAGS_NO_PM_RESET)
>> @@ -3988,7 +3989,10 @@ static int pci_pm_reset(struct pci_dev *dev, int probe)
>> pci_write_config_word(dev, dev->pm_cap + PCI_PM_CTRL, csr);
>> pci_dev_d3_sleep(dev);
>>
>> - return 0;
>> + if (delay < pci_pm_d3_delay)
>> + delay = pci_pm_d3_delay;
>> +
>> + return pci_dev_wait(dev, "PM D3->D0", delay, 1000);
> 1) Why do we wait up to 1 second here, when we wait up to 60 seconds
> for the other methods? Can they all be the same? Maybe a #define for
> it?

I know you want to have similar behavior for systems that do and do not support
CRS. That was the reason why I converted flr wait function to into dev_wait function.

However, here is the problem:

For systems that do not support CRS, there is no way of knowing whether we
are reading 0xFFFFFFFF because the endpoint is not reachable due to an error
like "it doesn't support this reset type" or if it is actually emitting a CRS.

If one system has a problem with pm_reset, this code would add an unnecessary
1 second delay into the reset path. If I make it 60 it would be something like:

1. try reset method A
2. wait 60 seconds
3. try reset method B
4. wait 60 seconds.
5. try reset method C
6. wait 60 seconds

This might end up being a regression on some system.

I'm still leaning towards a wait only if we are observing a CRS. What's your
thought on this?

then the sequence would be.

1. try reset method A
2. if CRS pending, wait 60 seconds
3. try reset method B
4. if CRS pending, wait 60 seconds.
5. try reset method C
6. if CRS pending, wait 60 seconds

>
> 2) I don't really like the fact that we do the initial sleep one place
> and then pass the length of that sleep here. It's hard to verify
> they're the same and keep them in sync. I think the only thing you
> use initial_wait for is to include that time in the dmesg messages.
> Maybe we should just omit that time from the message and drop the
> parameter?
>

This was for printing reasons like you spotted, I can certainly get rid of
the initial_wait.

--
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.