RE: [PATCH] nvme/pci: Use host managed power state for suspend

From: Mario.Limonciello
Date: Mon May 13 2019 - 10:45:23 EST


> -----Original Message-----
> From: Christoph Hellwig <hch@xxxxxx>
> Sent: Monday, May 13, 2019 9:38 AM
> To: Limonciello, Mario
> Cc: keith.busch@xxxxxxxxx; hch@xxxxxx; sagi@xxxxxxxxxxx; linux-
> nvme@xxxxxxxxxxxxxxxxxxx; rafael@xxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; linux-
> pm@xxxxxxxxxxxxxxx; kai.heng.feng@xxxxxxxxxxxxx
> Subject: Re: [PATCH] nvme/pci: Use host managed power state for suspend
>
> On Mon, May 13, 2019 at 02:24:41PM +0000, Mario.Limonciello@xxxxxxxx wrote:
> > I've received a report from one of my partners that this patch doesn't
> > work properly and the platform doesn't go into a lower power state.
>
> Well, it sounds like your partner's device does not work properly in this
> case. There is nothing in the NVMe spec that says queues should be
> torn down for deep power states, and that whole idea seems rather
> counterproductive to low-latency suspend/resume cycles.

Well, I've got a thought. Quoting the NVMe spec:
"After a successful completion of a Set Features command for this feature, the controller shall be in the
Power State specified. If enabled, autonomous power state transitions continue to occur from the new state."

If APST is enabled on this disk, what is to stop queue activity from triggering
an autonomous transition back out of that state on the way down?
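For context, my understanding is that the patch boils down to something like the
sketch below (the helper name is mine for illustration only, error handling
trimmed):

/*
 * Rough sketch, not the actual patch: enter a host managed power state
 * via Set Features (Power Management, FID 0x02).
 */
static int nvme_set_power_state(struct nvme_ctrl *ctrl, u32 ps)
{
	/*
	 * Per the spec text quoted above, the controller enters the
	 * requested state on successful completion -- but if APST is
	 * enabled (ctrl->apst_enabled), later queue activity can
	 * autonomously move it back to an operational state.
	 */
	return nvme_set_features(ctrl, NVME_FEAT_POWER_MGMT, ps,
				 NULL, 0, NULL);
}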

>
> > This was not a disk with HMB, but with regard to the HMB I believe it
> > needs to be removed during s0ix so that there is no chance the SSD
> > thinks it can still access HMB memory in s0ix.
>
> There is no mistake - the device is allowed to use the HMB from the
> point that we give it the memory range until the point where we either
> disable it, or shut the controller down. If something else requires the
> device not to use the HMB after ->suspend is called we need to disable
> the HMB, and we better have a good reason for that and document it in
> the code. Note that shutting down queues or having CPU memory barriers
> is not going to help with any of that.

So the CPU memory barriers were probably just working around this by changing
the timing, rather than actually fixing anything.
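
If the HMB really does have to be off during s0ix, then something along these
lines in the ->suspend path would make that explicit instead of depending on
barriers or timing. This is only a rough sketch against the existing
nvme_set_host_mem() helper in pci.c; the exact condition to check (and whether
to re-enable on resume) is the open question:

/*
 * Illustrative sketch only: drop the HMB before entering the low power
 * state so the device is no longer permitted to touch host memory.
 */
static int nvme_suspend_hmb(struct nvme_dev *ndev)
{
	/* Clear the Enable Host Memory bit if an HMB was handed out. */
	if (ndev->host_mem_descs)
		return nvme_set_host_mem(ndev, 0);
	return 0;
}

Whether the extra Set Features round trips on every suspend/resume are
acceptable for resume latency is a separate question.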