RE: [Regression] Commit "nvme/pci: Use host managed power state for suspend" has problems

From: Mario.Limonciello
Date: Tue Aug 06 2019 - 10:02:29 EST




> -----Original Message-----
> From: Rafael J. Wysocki <rafael@xxxxxxxxxx>
> Sent: Monday, August 5, 2019 4:29 PM
> To: Kai-Heng Feng
> Cc: Rafael J. Wysocki; Limonciello, Mario; Keith Busch; Keith Busch; Christoph
> Hellwig; Sagi Grimberg; linux-nvme; Linux PM; Linux Kernel Mailing List; Rajat Jain
> Subject: Re: [Regression] Commit "nvme/pci: Use host managed power state for
> suspend" has problems
>
>
> [EXTERNAL EMAIL]
>
> On Mon, Aug 5, 2019 at 9:14 PM Kai-Heng Feng
> <kai.heng.feng@xxxxxxxxxxxxx> wrote:
> >
> > at 19:04, Rafael J. Wysocki <rafael@xxxxxxxxxx> wrote:
> >
> > > On Fri, Aug 2, 2019 at 12:55 PM Kai-Heng Feng
> > > <kai.heng.feng@xxxxxxxxxxxxx> wrote:
> > >> at 06:26, Rafael J. Wysocki <rafael@xxxxxxxxxx> wrote:
> > >>
> > >>> On Thu, Aug 1, 2019 at 9:05 PM <Mario.Limonciello@xxxxxxxx> wrote:
> > >>>>> -----Original Message-----
> > >>>>> From: Rafael J. Wysocki <rafael@xxxxxxxxxx>
> > >>>>> Sent: Thursday, August 1, 2019 12:30 PM
> > >>>>> To: Kai-Heng Feng; Keith Busch; Limonciello, Mario
> > >>>>> Cc: Keith Busch; Christoph Hellwig; Sagi Grimberg; linux-nvme; Linux
> > >>>>> PM; Linux
> > >>>>> Kernel Mailing List; Rajat Jain
> > >>>>> Subject: Re: [Regression] Commit "nvme/pci: Use host managed power
> > >>>>> state for
> > >>>>> suspend" has problems
> > >>>>>
> > >>>>>
> > >>>>> [EXTERNAL EMAIL]
> > >>>>>
> > >>>>> On Thu, Aug 1, 2019 at 11:06 AM Kai-Heng Feng
> > >>>>> <kai.heng.feng@xxxxxxxxxxxxx> wrote:
> > >>>>>> at 06:33, Rafael J. Wysocki <rafael@xxxxxxxxxx> wrote:
> > >>>>>>
> > >>>>>>> On Thu, Aug 1, 2019 at 12:22 AM Keith Busch <kbusch@xxxxxxxxxx>
> > >>>>>>> wrote:
> > >>>>>>>> On Wed, Jul 31, 2019 at 11:25:51PM +0200, Rafael J. Wysocki wrote:
> > >>>>>>>>> A couple of remarks if you will.
> > >>>>>>>>>
> > >>>>>>>>> First, we don't know which case is the majority at this point. For
> > >>>>>>>>> now, there is one example of each, but it may very well turn out
> > >>>>>>>>> that
> > >>>>>>>>> the SK Hynix BC501 above needs to be quirked.
> > >>>>>>>>>
> > >>>>>>>>> Second, the reference here really is 5.2, so if there are any
> > >>>>>>>>> systems
> > >>>>>>>>> that are not better off with 5.3-rc than they were with 5.2,
> > >>>>>>>>> well, we
> > >>>>>>>>> have not made progress. However, if there are systems that are
> > >>>>>>>>> worse
> > >>>>>>>>> off with 5.3, that's bad. In the face of the latest findings the
> > >>>>>>>>> only
> > >>>>>>>>> way to avoid that is to be backwards compatible with 5.2 and that's
> > >>>>>>>>> where my patch is going. That cannot be achieved by quirking all
> > >>>>>>>>> cases that are reported as "bad", because there still may be
> > >>>>>>>>> unreported ones.
> > >>>>>>>>
> > >>>>>>>> I have to agree. I think your proposal may allow PCI D3cold,
> > >>>>>>>
> > >>>>>>> Yes, it may.
> > >>>>>>
> > >>>>>> Somehow the 9380 with Toshiba NVMe never hits SLP_S0 with or
> without
> > >>>>>> Rafaelâs patch.
> > >>>>>> But the ârealâ s2idle power consumption does improve with the patch.
> > >>>>>
> > >>>>> Do you mean this patch:
> > >>>>>
> > >>>>> https://lore.kernel.org/linux-pm/70D536BE-8DC7-4CA2-84A9-
> > >>>>>
> AFB067BA520E@xxxxxxxxxxxxx/T/#m456aa5c69973a3b68f2cdd4713a1ce83be5145
> > >>>>> 8f
> > >>>>>
> > >>>>> or the $subject one without the above?
> > >>>>>
> > >>>>>> Can we use a DMI based quirk for this platform? It seems like a
> > >>>>>> platform
> > >>>>>> specific issue.
> > >>>>>
> > >>>>> We seem to see too many "platform-specific issues" here. :-)
> > >>>>>
> > >>>>> To me, the status quo (ie. what we have in 5.3-rc2) is not defensible.
> > >>>>> Something needs to be done to improve the situation.
> > >>>>
> > >>>> Rafael, would it be possible to try popping out PC401 from the 9380 and
> > >>>> into a 9360 to
> > >>>> confirm there actually being a platform impact or not?
> > >>>
> > >>> Not really, sorry.
> > >>>
> > >>>> I was hoping to have something useful from Hynix by now before
> > >>>> responding, but oh well.
> > >>>>
> > >>>> In terms of what is the majority, I do know that between folks at Dell,
> > >>>> Google, Compal,
> > >>>> Wistron, Canonical, Micron, Hynix, Toshiba, LiteOn, and Western Digital
> > >>>> we tested a wide
> > >>>> variety of SSDs with this patch series. I would like to think that they
> > >>>> are representative of
> > >>>> what's being manufactured into machines now.
> > >>>
> > >>> Well, what about drives already in the field? My concern is mostly
> > >>> about those ones.
> > >>>
> > >>>> Notably the LiteOn CL1 was tested with the HMB flushing support and
> > >>>> and Hynix PC401 was tested with older firmware though.
> > >>>>
> > >>>>>>>> In which case we do need to reintroduce the HMB handling.
> > >>>>>>>
> > >>>>>>> Right.
> > >>>>>>
> > >>>>>> The patch alone doesnât break HMB Toshiba NVMe I tested. But I think
> > >>>>>> itâs
> > >>>>>> still safer to do proper HMB handling.
> > >>>>>
> > >>>>> Well, so can anyone please propose something specific? Like an
> > >>>>> alternative patch?
> > >>>>
> > >>>> This was proposed a few days ago:
> > >>>> http://lists.infradead.org/pipermail/linux-nvme/2019-July/026056.html
> > >>>>
> > >>>> However we're still not sure why it is needed, and it will take some
> > >>>> time to get
> > >>>> a proper failure analysis from LiteOn regarding the CL1.
> > >>>
> > >>> Thanks for the update, but IMO we still need to do something before
> > >>> final 5.3 while the investigation continues.
> > >>>
> > >>> Honestly, at this point I would vote for going back to the 5.2
> > >>> behavior at least by default and only running the new code on the
> > >>> drives known to require it (because they will block PC10 otherwise).
> > >>>
> > >>> Possibly (ideally) with an option for users who can't get beyond PC3
> > >>> to test whether or not the new code helps them.
> > >>
> > >> I just found out that the XPS 9380 at my hand never reaches SLP_S0 but
> > >> only
> > >> PC10.
> > >
> > > That's the case for me too.
> > >
> > >> This happens with or without putting the device to D3.
> > >
> > > On my system, though, it only can get to PC3 without putting the NVMe
> > > into D3 (as reported previously).
> >
> > I forgot to ask, what BIOS version does the system have?
> > I donât see this issue on BIOS v1.5.0.
>
> It is 1.5.0 here too.

All, regarding the need for the patch proposed by Rafael on PC401, I have some updates
to share from Hynix.
First off - the firmware changelog is misleading from 80006E00 to 80007E00.

The change was made in the firmware specifically because of a change in behavior from
Intel KBL to CFL and WHL. On CFL/WHL the period of time that RefClk was turned on after L1.2
was larger than KBL platforms. So this meant that Hynix couldn't lock up from CLKREQ#
to RefClk as quickly on CFL/WHL. So there is a "larger" fixed delay introduced in their FW.

To those that don't know - XPS 9380 is a WHL platform.

Second - a hypothesis of what is happening with the patch proposed by Rafael is that the link
is only transitioning to L1.0 rather than L1.2. This may satisfy the PMC but it shouldn't lead to
the lowest actual device power state.