RE: [PATCH] nvme-pci: Disable LTR for simple suspend

From: Limonciello, Mario
Date: Tue Mar 15 2022 - 11:58:34 EST



> -----Original Message-----
> From: Christoph Hellwig <hch@xxxxxx>
> Sent: Tuesday, March 15, 2022 02:23
> To: Limonciello, Mario <Mario.Limonciello@xxxxxxx>
> Cc: Keith Busch <kbusch@xxxxxxxxxx>; Jens Axboe <axboe@xxxxxx>;
> Christoph Hellwig <hch@xxxxxx>; Sagi Grimberg <sagi@xxxxxxxxxxx>; open
> list:NVM EXPRESS DRIVER <linux-nvme@xxxxxxxxxxxxxxxxxxx>; open list
> <linux-kernel@xxxxxxxxxxxxxxx>; Huang, Patrick <Patrick.Huang@xxxxxxx>;
> linux-pci@xxxxxxxxxxxxxxx
> Subject: Re: [PATCH] nvme-pci: Disable LTR for simple suspend
>
> On Mon, Mar 14, 2022 at 08:55:37AM -0500, Mario Limonciello wrote:
> > Some drives from SSSTC are showing stability problems after s0i3
> > entry when the Linux kernel is in s2idle loop if LTR has been
> > enabled. This leads to failures to resume.
> >
> > This appears to be a firmware issue specific to SSSTC SSDs, but to
> > avoid this class of problem, disable LTR when going into s2idle and
> > simple suspend has been set.
>
> This seems like a giant hammer to do this for all NVMe devices,
> why not quirk the specific ones?
>

The thought process was to be defensive against any other drives
that have this firmware deficiency as well, but point taken.

Will rework it as a quirk.

> > +static void nvme_suspend_ltr(struct device *dev, bool disable)
> > +{
> > + struct pci_dev *pdev = to_pci_dev(dev);
> > + struct nvme_dev *ndev = pci_get_drvdata(pdev);
> > +
> > + if (disable) {
> > + u16 word;
> > +
> > + pcie_capability_read_word(pdev, PCI_EXP_DEVCTL2, &word);
> > + ndev->restore_ltr = word & PCI_EXP_DEVCTL2_LTR_EN;
> > + pcie_capability_clear_word(pdev, PCI_EXP_DEVCTL2,
> > + PCI_EXP_DEVCTL2_LTR_EN);
> > + } else if (ndev->restore_ltr) {
> > + pcie_capability_set_word(pdev, PCI_EXP_DEVCTL2,
> > + PCI_EXP_DEVCTL2_LTR_EN);
> > + }
> > +}
>
> The calling conventions of this function are rather strange by
> mixing up two very different things.
>
> I think two PCI-level helpers to disable LTR and return the status
> it was in and to enable LTR would be really nice to have here.

If we add helper functions to PCI for this, it raises the question of
whether this should just be a PCI quirk. The LTR problem is tied to how
the device behaves in D3, not to anything that the NVMe driver has done.

Unless you feel otherwise, I'll rework it as a PCI quirk.
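
For illustration, here is a rough userspace sketch of the split suggested
above: one helper that disables LTR and reports whether it had been enabled,
and a second that re-enables it. This is only a mock under stated
assumptions, not kernel code. struct mock_pci_dev and both mock_pcie_*
helper names are hypothetical; only the PCI_EXP_DEVCTL2_LTR_EN bit value
matches the kernel's pci_regs.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* LTR enable bit (bit 10) in PCIe Device Control 2, as in pci_regs.h */
#define PCI_EXP_DEVCTL2_LTR_EN 0x0400

/* Hypothetical stand-in for struct pci_dev: only models DEVCTL2 */
struct mock_pci_dev {
	uint16_t devctl2;
};

/*
 * Hypothetical helper: clear the LTR enable bit and return whether it
 * was set beforehand, so the caller can decide whether to restore it.
 */
static bool mock_pcie_disable_ltr(struct mock_pci_dev *pdev)
{
	bool was_enabled = !!(pdev->devctl2 & PCI_EXP_DEVCTL2_LTR_EN);

	pdev->devctl2 &= (uint16_t)~PCI_EXP_DEVCTL2_LTR_EN;
	return was_enabled;
}

/* Hypothetical helper: set the LTR enable bit, leaving other bits alone */
static void mock_pcie_enable_ltr(struct mock_pci_dev *pdev)
{
	pdev->devctl2 |= PCI_EXP_DEVCTL2_LTR_EN;
}
```

On suspend the caller would store the return value of the disable helper,
and on resume call the enable helper only if that stored value was true.
This keeps the "save and disable" and "restore" steps in separate functions
instead of switching on a bool argument.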

>
> > if (ndev->last_ps == U32_MAX ||
> > nvme_set_power_state(ctrl, ndev->last_ps) != 0)
> > goto reset;
> > @@ -3239,6 +3259,11 @@ static int nvme_suspend(struct device *dev)
> >
> > ndev->last_ps = U32_MAX;
> >
> > + /* If using s2idle with simple suspend, disable LTR to avoid problems. */
>
> Overly long line here.