Re: [PATCH v2 1/2] PCI: Disable D3cold on Asus B1400 PCI-NVMe bridge

From: Daniel Drake
Date: Mon Feb 19 2024 - 06:35:50 EST


On Fri, Feb 9, 2024 at 9:36 AM Daniel Drake <drake@xxxxxxxxxxxxx> wrote:
> On Thu, Feb 8, 2024 at 5:57 PM David E. Box <david.e.box@xxxxxxxxxxxxxxx> wrote:
> > This does look like a firmware bug. We've had reports of D3cold support missing
> > when running in non-VMD mode on systems that were designed with VMD for Windows.
> > These issues have been caught and addressed by OEMs during enabling of Linux
> > systems. Does D3cold work in VMD mode?
>
> On Windows for the VMD=on case, we only tested this on a BIOS with
> StorageD3Enable=0. The NVMe device and parent bridge stayed in D0 over
> suspend, but that's exactly what the BIOS asked for, so it doesn't
> really answer your question.

Tested on the original BIOS version with VMD=on: Windows leaves the
NVMe device (and parent bridge) in D0 during suspend (i.e. same result
as VMD=off).

On this setup, there are 2 devices with StorageD3Enable flags:

1. \_SB.PC00.PEG0.PEGP._DSD has StorageD3Enable=1. This is set
regardless of the VMD setting at the BIOS level. This is the flag that
is causing us the headache in non-VMD mode where Linux then proceeds
to put devices into D3cold.
This PEGP device in the non-VMD configuration corresponds to the NVMe
storage device. PEG0 is the PCI root port at 00:06.0 (the one in
question in this thread), and PEGP is the child with address 0.
However, in VMD mode, 00:06.0 is a dummy device (not a bridge), so this
PEGP device isn't going to be used by anything.

2. \_SB.PC00.VMD0._DSD has StorageD3Enable=0. This VMD0 device is only
present when VMD is enabled in the BIOS. It is the companion for
00:0e.0, which is the device that the vmd driver binds against. This
could be influencing Windows to leave the NVMe device in D0, but I
doubt it, because it can't explain why Windows shows the same D0
behaviour when VMD=off. It is also a really strange place to put the
StorageD3Enable setting, because VMD0 is not a storage device.

> On Linux with VMD=on and StorageD3Enable=1, the NVMe storage device
> and the VMD parent bridge are staying in D0 over suspend. I don't know
> why this is, I would have expected at least D3hot. However, given
> that the NVMe device has no firmware_node under the VMD=on setup, I
> believe there is no way it would enter D3cold because there's no
> linkage to an ACPI device, so no available _PS3 or _PR0 or whatever is
> the precise definition of D3cold.

Checked in more detail. In Linux, the NVMe device will only go into
D3hot/D3cold if the ACPI companion device has an explicit
StorageD3Enable=1. However, in VMD mode the NVMe storage device has no
ACPI companion. Code flow is nvme_pci_alloc_dev() -> acpi_storage_d3()
-> return false because no companion.
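
To illustrate the decision described above, here is a minimal userspace
sketch. It is not the kernel code: the struct and function names
(model_acpi_companion, model_pci_dev, model_storage_d3) are hypothetical
stand-ins, and it only models the two conditions mentioned here (a
companion must exist, and it must carry an explicit StorageD3Enable=1).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an ACPI companion device; not a kernel type. */
struct model_acpi_companion {
	bool has_storage_d3_enable;      /* _DSD carries a StorageD3Enable property */
	unsigned char storage_d3_enable; /* value of that property, if present */
};

/* Hypothetical stand-in for a PCI device; companion is NULL when absent. */
struct model_pci_dev {
	const struct model_acpi_companion *companion;
};

/* Models the check: D3 is only allowed with an explicit StorageD3Enable=1. */
bool model_storage_d3(const struct model_pci_dev *dev)
{
	const struct model_acpi_companion *adev = dev->companion;

	if (adev == NULL)
		return false; /* VMD=on case: the NVMe device has no companion */
	if (!adev->has_storage_d3_enable)
		return false; /* companion exists but the property is missing */
	return adev->storage_d3_enable == 1;
}
```

With a NULL companion (the VMD=on NVMe device) the function returns
false; with a companion carrying StorageD3Enable=1 (PEGP in the VMD=off
case) it returns true.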

The VMD PCI bridge at 10000:e0:06.0 that is the parent of the SATA and
NVMe devices does have a companion: \_SB.PC00.VMD0.PEG0.
However, the SATA and NVMe child devices do not have any ACPI
companion. I examined the logic of vmd_acpi_find_companion() and
determined that it is looking for devices with _ADR 80b8ffff (SATA)
and 8100ffff (NVMe), and such devices do not exist in the ACPI tables.
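
As a rough illustration of where those two _ADR values come from, the
sketch below assumes the address is composed as
(relative bus number << 24) | (devfn << 16) | 0x8000ffff. That is one
reading of vmd_acpi_find_companion() and should be checked against the
driver source. The bus/devfn pairs used in the usage note (SATA at
relative bus 0, device 0x17 function 0; NVMe at relative bus 1, device 0
function 0) are inferred by working backwards from the two values and
are not stated in this thread. The names model_vmd_adr and
MODEL_PCI_DEVFN are hypothetical.

```c
#include <stdint.h>

/* Device number in bits 7:3, function in bits 2:0, like the kernel's PCI_DEVFN(). */
#define MODEL_PCI_DEVFN(dev, fn) ((((dev) & 0x1f) << 3) | ((fn) & 0x07))

/*
 * Assumed composition of the _ADR that the VMD companion lookup searches
 * for: relative bus number in the top byte (with bit 31 set), devfn in the
 * next byte, and 0xffff in the low 16 bits.
 */
uint32_t model_vmd_adr(uint32_t rel_busnr, uint32_t devfn)
{
	return (rel_busnr << 24) | (devfn << 16) | 0x8000FFFFu;
}
```

Under these assumptions, model_vmd_adr(0, MODEL_PCI_DEVFN(0x17, 0))
gives 0x80b8ffff and model_vmd_adr(1, MODEL_PCI_DEVFN(0, 0)) gives
0x8100ffff, matching the two addresses above.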

Speculating a little, I guess this is also why Windows leaves the
device in D0 in VMD=on mode: it would only put the NVMe device in
D3hot/D3cold if it had a corresponding companion with
StorageD3Enable=1, and there isn't one of those. What's still unknown
is why it doesn't put the device in D3 in VMD=off mode, given that
there is a correctly placed StorageD3Enable=1 in that case.

Daniel