> On Thu, Sep 22, 2022 at 09:09:28PM +0530, Krishna Chaitanya Chundru wrote:
> > On 9/21/2022 10:26 PM, Bjorn Helgaas wrote:
> > > [+cc Rafael, linux-pm since this is real power management magic,
> > > beginning of thread:
> > > https://lore.kernel.org/all/1663669347-29308-1-git-send-email-quic_krichai@xxxxxxxxxxx/
> > > full patch since I trimmed too much of it:
> > > https://lore.kernel.org/all/1663669347-29308-2-git-send-email-quic_krichai@xxxxxxxxxxx/]
> > >
> > > On Wed, Sep 21, 2022 at 03:23:35PM +0530, Krishna Chaitanya Chundru wrote:
> > > > On 9/20/2022 11:46 PM, Bjorn Helgaas wrote:
> > > > > On Tue, Sep 20, 2022 at 03:52:23PM +0530, Krishna chaitanya chundru wrote:
> > > > > > Add suspend and resume syscore ops.
> > > > > >
> > > > > > A few PCIe endpoints like NVMe and WLANs always expect the device
> > > > > > to be in D0 state and the link to be active (or in L1ss) all the time
> > > > > > (including in S3 state).
> > > > > What does this have to do with the patch? I don't see any NVMe or
> > > > > WLAN patches here.
> > > > The existing NVMe driver expects the NVMe device to be in D0 during S3
> > > > as well. If we turn off the link in suspend, the NVMe resume path is
> > > > broken as the state machine is getting reset in the NVMe device.
> > > > Due to this, the host driver state machine and the device state machine
> > > > go out of sync, and all NVMe commands after resume time out.
> > > >
> > > > IIRC, Tegra is also facing this issue with NVMe.
> > > >
> > > > This issue has been discussed in the threads below:
> > > > https://lore.kernel.org/all/Yl+6V3pWuyRYuVV8@xxxxxxxxxxxxx/T/
> > > > https://lore.kernel.org/linux-nvme/20220201165006.3074615-1-kbusch@xxxxxxxxxx/
> > > The problem is that this commit log doesn't explain the problem and
> > > doesn't give us anything to connect the NVMe and WLAN assumptions with
> > > this special driver behavior. There needs to be some explicit
> > > property of NVMe and WLAN that the PM core or drivers like qcom can
> > > use to tell whether the clocks can be turned off.
> > Not only that, NVMe expects the device to always be in D0.
> > So PCIe drivers should not turn off the link in suspend and
> > retrain it in resume, as the NVMe device treats this as a power
> > cycle, which eventually increases the wear of the NVMe flash.
> I can't quite parse this. Are you saying that all PCI devices should
> stay in D0 when the system is in S3?

Not all PCI devices, only some PCI devices like NVMe. The NVMe driver expects the device to stay in D0.

> > We are trying to keep the device in D0 and also reduce the power
> > consumption when the system is in S3 by turning off clocks and phy
> > with this patch series.
> The decision to keep a device in D0 is not up to qcom or any other PCI
> controller driver.

Yes, it is the NVMe driver that decides to keep the device in D0. Our QCOM
> > > > > > In qcom platform PCIe resources (clocks, phy, etc.) can be
> > > > > > released when the link is in L1ss to reduce the power
> > > > > > consumption. So if the link is in L1ss, release the PCIe
> > > > > > resources. And when the system resumes, enable the PCIe
> > > > > > resources if they were released in the suspend path.
> > > > > What's the connection with L1.x? Links enter L1.x based on
> > > > > activity and timing. That doesn't seem like a reliable
> > > > > indicator to turn PHYs off and disable clocks.
> > > > This is a Qcom PHY-specific feature (retaining the link state in
> > > > L1.x with clocks turned off). It is possible only with the link
> > > > being in L1.x. The PHY can't retain the link state in L0 with the
> > > > clocks turned off, and we need to re-train the link if it's in L2
> > > > or L3. So we can support this feature only with L1.x. That is
> > > > the reason we are taking L1.x as the trigger to turn off clocks
> > > > (only in the suspend path).
> > > This doesn't address my question. L1.x is an ASPM feature, which
> > > means hardware may enter or leave L1.x autonomously at any time
> > > without software intervention. Therefore, I don't think reading the
> > > current state is a reliable way to decide anything.
> > After the link enters L1.x, it will come out only if there is
> > some activity on the link. As the system is suspended and the NVMe
> > driver is also suspended (queues will freeze in suspend), who else
> > can initiate any data?
> I don't think we can assume that nothing will happen to cause exit
> from L1.x. For instance, PCIe Messages for INTx signaling, LTR, OBFF,
> PTM, etc., may be sent even though we think the device is idle and
> there should be no link activity.

I don't think there will be some activity on the link after it enters L1.x, as you mentioned,

> Bjorn