Re: [PATCH v2 1/2] PCI: Disable D3cold on Asus B1400 PCI-NVMe bridge

From: Jian-Hong Pan
Date: Wed Feb 07 2024 - 04:06:50 EST


Daniel Drake <drake@xxxxxxxxxxxxx> wrote on Wed, Feb 7, 2024 at 4:44 PM:
>
> The Asus B1400 with original shipped firmware versions and VMD disabled
> cannot resume from suspend: the NVMe device becomes unresponsive and
> inaccessible.
>
> This is because the NVMe device and parent PCI bridge get put into D3cold
> during suspend, and this PCI bridge cannot be recovered from D3cold mode:
>
> echo "0000:01:00.0" > /sys/bus/pci/drivers/nvme/unbind
> echo "0000:00:06.0" > /sys/bus/pci/drivers/pcieport/unbind
> setpci -s 00:06.0 CAP_PM+4.b=03 # D3hot
> acpidbg -b "execute \_SB.PC00.PEG0.PXP._OFF"
> acpidbg -b "execute \_SB.PC00.PEG0.PXP._ON"
> setpci -s 00:06.0 CAP_PM+4.b=0 # D0
> echo "0000:00:06.0" > /sys/bus/pci/drivers/pcieport/bind
> echo "0000:01:00.0" > /sys/bus/pci/drivers/nvme/bind
> # NVMe probe fails here with -ENODEV
>
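
For readers following the setpci steps above: the byte at CAP_PM+4 is the
low byte of the PCI Power Management Control/Status Register (PMCSR), and
bits 1:0 of it select the power state (0 = D0, 3 = D3hot). D3cold has no
encoding there; it is reached by removing power, which is what the _OFF
method does. A minimal userspace sketch of that encoding follows. The helper
names pmcsr_set_state() and pmcsr_state_name() are made up for illustration
and are not kernel APIs.

```c
#include <stdint.h>

/* Bits 1:0 of PMCSR hold the device power state. */
#define PMCSR_STATE_MASK 0x3u

enum pm_state { PM_D0 = 0, PM_D1 = 1, PM_D2 = 2, PM_D3HOT = 3 };

/* Hypothetical helper: return pmcsr with only the state field replaced. */
static uint8_t pmcsr_set_state(uint8_t pmcsr, enum pm_state state)
{
	return (uint8_t)((pmcsr & ~PMCSR_STATE_MASK) |
			 ((unsigned int)state & PMCSR_STATE_MASK));
}

/* Hypothetical helper: name the state encoded in the low two bits. */
static const char *pmcsr_state_name(uint8_t pmcsr)
{
	static const char *const names[] = { "D0", "D1", "D2", "D3hot" };

	return names[pmcsr & PMCSR_STATE_MASK];
}
```

Note that "setpci ... CAP_PM+4.b=03" overwrites the whole byte, whereas the
sketch preserves the other bits; for this reproduction the difference should
not matter.
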
> This appears to be an untested D3cold transition by the vendor; Intel
> socwatch shows that Windows leaves the NVMe device and parent bridge in D0
> during suspend, even though these firmware versions have StorageD3Enable=1.
>
> Experiments with the DSDT show that the _OFF method calls DL23(), which sets
> an L23E bit at offset 0xe2 in the PCI configuration space of this root port.
> This is the specific write that the _ON routine is unable to recover from.
> This register is not documented in the public chipset datasheet.
>
> Disallow D3cold on the PCI bridge to enable successful suspend/resume.
>
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=215742
> Signed-off-by: Daniel Drake <drake@xxxxxxxxxxxxx>

Signed-off-by: Jian-Hong Pan <jhp@xxxxxxxxxxxxx>

> ---
> arch/x86/pci/fixup.c | 45 ++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 45 insertions(+)
>
> v2:
> Match only specific BIOS versions where this quirk is required.
> Add subsequent patch to this series to revert the original S3 workaround
> now that s2idle is usable again.
>
> diff --git a/arch/x86/pci/fixup.c b/arch/x86/pci/fixup.c
> index f347c20247d30..6b0b341178e4f 100644
> --- a/arch/x86/pci/fixup.c
> +++ b/arch/x86/pci/fixup.c
> @@ -907,6 +907,51 @@ static void chromeos_fixup_apl_pci_l1ss_capability(struct pci_dev *dev)
> DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x5ad6, chromeos_save_apl_pci_l1ss_capability);
> DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_INTEL, 0x5ad6, chromeos_fixup_apl_pci_l1ss_capability);
>
> +/*
> + * Disable D3cold on Asus B1400 PCIe bridge at 00:06.0.
> + *
> + * On this platform with VMD off, the NVMe's parent PCI bridge cannot
> + * successfully power back on from D3cold, resulting in unresponsive NVMe on
> + * resume. This appears to be an untested transition by the vendor: Windows
> + * leaves the NVMe and parent bridge in D0 during suspend.
> + * This is only needed on BIOS versions before 308; the newer versions flip
> + * StorageD3Enable from 1 to 0.
> + */
> +static const struct dmi_system_id asus_nvme_broken_d3cold_table[] = {
> +	{
> +		.matches = {
> +			DMI_MATCH(DMI_SYS_VENDOR, "ASUSTeK COMPUTER INC."),
> +			DMI_MATCH(DMI_BIOS_VERSION, "B1400CEAE.304"),
> +		},
> +	},
> +	{
> +		.matches = {
> +			DMI_MATCH(DMI_SYS_VENDOR, "ASUSTeK COMPUTER INC."),
> +			DMI_MATCH(DMI_BIOS_VERSION, "B1400CEAE.305"),
> +		},
> +	},
> +	{
> +		.matches = {
> +			DMI_MATCH(DMI_SYS_VENDOR, "ASUSTeK COMPUTER INC."),
> +			DMI_MATCH(DMI_BIOS_VERSION, "B1400CEAE.306"),
> +		},
> +	},
> +	{
> +		.matches = {
> +			DMI_MATCH(DMI_SYS_VENDOR, "ASUSTeK COMPUTER INC."),
> +			DMI_MATCH(DMI_BIOS_VERSION, "B1400CEAE.307"),
> +		},
> +	},
> +	{}
> +};
> +
> +static void asus_disable_nvme_d3cold(struct pci_dev *pdev)
> +{
> +	if (dmi_check_system(asus_nvme_broken_d3cold_table) > 0)
> +		pci_d3cold_disable(pdev);
> +}
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x9a09, asus_disable_nvme_d3cold);
> +
> #ifdef CONFIG_SUSPEND
> /*
> * Root Ports on some AMD SoCs advertise PME_Support for D3hot and D3cold, but
> --
> 2.43.0
>
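
For anyone who wants to check which firmware strings the table above would
catch without booting a kernel, here is a rough userspace sketch of the
lookup. It assumes, as I understand the kernel's behaviour, that DMI_MATCH is
a substring comparison and that dmi_check_system() returns the number of
matching entries. struct quirk_entry and count_matches() are simplified
stand-ins invented for this sketch, not kernel interfaces.

```c
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for struct dmi_system_id: two fields only. */
struct quirk_entry {
	const char *sys_vendor;   /* substring expected in the system vendor */
	const char *bios_version; /* substring expected in the BIOS version */
};

/* Same four firmware versions as the patch; a NULL entry ends the table. */
static const struct quirk_entry broken_d3cold_table[] = {
	{ "ASUSTeK COMPUTER INC.", "B1400CEAE.304" },
	{ "ASUSTeK COMPUTER INC.", "B1400CEAE.305" },
	{ "ASUSTeK COMPUTER INC.", "B1400CEAE.306" },
	{ "ASUSTeK COMPUTER INC.", "B1400CEAE.307" },
	{ NULL, NULL }
};

/* Count the entries whose fields both occur in the given strings. */
static int count_matches(const struct quirk_entry *table,
			 const char *vendor, const char *bios)
{
	int n = 0;

	for (; table->sys_vendor; table++)
		if (strstr(vendor, table->sys_vendor) &&
		    strstr(bios, table->bios_version))
			n++;
	return n;
}
```

On a real machine the two input strings could be read from
/sys/class/dmi/id/sys_vendor and /sys/class/dmi/id/bios_version. A BIOS
version of "B1400CEAE.308" yields zero matches here, which is the intended
behaviour since that firmware already sets StorageD3Enable to 0.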