Re: [PATCH v3 1/2] PCI: Add D3cold as general reset method
From: Alex Williamson
Date: Wed May 13 2026 - 09:59:00 EST
On Wed, May 13, 2026, at 6:23 AM, Jose Ignacio Tornos Martinez wrote:
> Add D3cold power cycle as a general PCI reset method, available for
> single-function devices. This provides a more robust reset mechanism
> than D3hot for devices where a full power cycle is beneficial.
>
> The implementation uses pci_set_power_state(dev, PCI_D3cold), which
> automatically handles platform differences:
> - Platforms WITH _PR3 ACPI power resources: true D3cold (power cycle)
> - Platforms WITHOUT _PR3: automatic fallback to D3hot transition
>
> D3cold reset is placed at the end of the reset hierarchy as a last
> resort before giving up, since it provides a strong reset when other
> methods are unavailable or broken.
>
> Reset hierarchy with this change:
> 1. device_specific
> 2. acpi
> 3. flr
> 4. af_flr
> 5. pm (D3hot via config space)
> 6. bus (SBR)
> 7. cxl_bus
> 8. d3cold (NEW - power cycle with D3hot fallback)
>
> This benefits devices that:
> - Have broken or missing FLR
> - Advertise NoSoftRst+ incorrectly (blocking D3hot PM reset)
> - Have broken bus reset implementations
> - Need power cycle for reliable reset
> - Are used in VFIO passthrough scenarios
>
> Signed-off-by: Jose Ignacio Tornos Martinez <jtornosm@xxxxxxxxxx>
> ---
> v3: Implement d3cold as a general PCI core reset method instead of
> device-specific quirk approach from v2 (Alex Williamson suggestion)
> v2: https://lore.kernel.org/all/20260508145153.717641-2-jtornosm@xxxxxxxxxx/
>
> drivers/pci/pci.c | 41 +++++++++++++++++++++++++++++++++++++++++
> include/linux/pci.h | 2 +-
> 2 files changed, 42 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 8f7cfcc00090..6da8feff7ccc 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -4491,6 +4491,46 @@ static int pci_pm_reset(struct pci_dev *dev, bool probe)
> return ret;
> }
>
> +/**
> + * pci_d3cold_reset - Put device into D3cold (or D3hot) and back to D0 for reset
> + * @dev: PCI device to reset
> + * @probe: if true, check if D3cold reset is supported; if false, perform reset
> + *
> + * Attempt to reset the device by transitioning through D3cold and back to D0.
> + * On platforms with ACPI _PR3 power resources, this performs a true D3cold
> + * power cycle (actual power removal). On platforms without _PR3 support,
> + * pci_set_power_state() automatically falls back to D3hot, providing a
> + * D3hot->D0 reset transition.
> + *
> + * Only available for single-function devices to avoid affecting other
> + * functions in multi-function devices.
> + *
> + * Returns 0 if device can be/was reset this way, -ENOTTY if not supported,
> + * or other negative error code on failure.
> + */
> +static int pci_d3cold_reset(struct pci_dev *dev, bool probe)
> +{
> + int ret;
> +
> + if (dev->multifunction)
> + return -ENOTTY;
> +
> + if (probe) {
> + if (!pci_pr3_present(dev))
> + pci_dbg(dev, "d3cold reset: no _PR3 support, will use D3hot fallback\n");
This fall-through is invalid. If D3cold can't be reached, the reset should be handled by pci_pm_reset(), where NoSoftRst is honored. pci_pr3_present() should be tested in all cases, not just probe. I think we also need to test device flags that would prevent a D3cold transition, similar to what pci_pm_reset() does. Thanks,
Alex
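
A minimal sketch of the check ordering described in the review comment above, written as a userspace model rather than kernel code. It is one possible reading of the comments, not code from the patch or from the reviewer. The struct and every mock_* name are hypothetical stand-ins: has_pr3 models the result of pci_pr3_present(dev), while no_d3 and no_d3cold model the kind of device flags that could rule out a D3cold transition (which flags a real implementation should test is an assumption here).

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the few power states this model needs. */
enum mock_pstate { MOCK_D0, MOCK_D3HOT, MOCK_D3COLD };

/* Hypothetical stand-in for the struct pci_dev fields used by the checks. */
struct mock_pci_dev {
	bool multifunction;
	bool has_pr3;		/* models pci_pr3_present(dev) */
	bool no_d3;		/* models a quirk flag forbidding D3 (assumption) */
	bool no_d3cold;		/* models a per-device "no D3cold" flag (assumption) */
	enum mock_pstate current_state;
	int power_cycles;	/* counts completed D3cold round trips */
};

static int mock_d3cold_reset(struct mock_pci_dev *dev, bool probe)
{
	if (dev->multifunction)
		return -ENOTTY;

	/*
	 * Tested for both probe and reset: without _PR3 the device cannot
	 * reach D3cold, so decline instead of falling back to D3hot. The
	 * D3hot case is left to the "pm" method, which honors NoSoftRst.
	 */
	if (!dev->has_pr3)
		return -ENOTTY;

	/* Decline when a device flag rules out the D3cold transition. */
	if (dev->no_d3 || dev->no_d3cold)
		return -ENOTTY;

	if (probe)
		return 0;

	if (dev->current_state != MOCK_D0)
		return -EINVAL;

	/* Model the power cycle: D0 -> D3cold -> D0. */
	dev->current_state = MOCK_D3COLD;
	dev->current_state = MOCK_D0;
	dev->power_cycles++;

	return 0;
}
```

With this ordering, probe and reset agree: a device without _PR3 never advertises "d3cold", so a reset request cannot silently turn into a D3hot transition that bypasses the NoSoftRst check.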
> + return 0;
> + }
> +
> + if (dev->current_state != PCI_D0)
> + return -EINVAL;
> +
> + ret = pci_set_power_state(dev, PCI_D3cold);
> + if (ret)
> + return ret;
> +
> + return pci_set_power_state(dev, PCI_D0);
> +}
> +
> /**
> * pcie_wait_for_link_status - Wait for link status change
> * @pdev: Device whose link to wait for.
> @@ -5065,6 +5105,7 @@ const struct pci_reset_fn_method pci_reset_fn_methods[] = {
> { pci_pm_reset, .name = "pm" },
> { pci_reset_bus_function, .name = "bus" },
> { cxl_reset_bus_function, .name = "cxl_bus" },
> + { pci_d3cold_reset, .name = "d3cold" },
> };
>
> /**
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 2c4454583c11..1ca7b880ead7 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -51,7 +51,7 @@
> PCI_STATUS_PARITY)
>
> /* Number of reset methods used in pci_reset_fn_methods array in pci.c */
> -#define PCI_NUM_RESET_METHODS 8
> +#define PCI_NUM_RESET_METHODS 9
>
> #define PCI_RESET_PROBE true
> #define PCI_RESET_DO_RESET false
> --
> 2.53.0