Re: [PATCH] vfio/pci: Allow disabling idle D3 on a per-device basis

From: Alex Williamson

Date: Wed Apr 22 2026 - 17:52:09 EST


On Wed, 22 Apr 2026 04:13:07 -0400
lirongqing <lirongqing@xxxxxxxxx> wrote:

> From: Li RongQing <lirongqing@xxxxxxxxx>
>
> The disable_idle_d3 module parameter currently toggles idle D3 power
> management for all devices handled by vfio-pci. This is too coarse for
> environments where only specific devices (e.g., certain GPUs or NICs)
> have issues with D3 state transition.
>
> For example, some PCIe devices exhibit hardware bugs or firmware issues
> when entering or exiting D3 state. These devices may experience PCIe link
> speed degradation after transitioning out of D3, reducing from Gen4/Gen5
> to lower speeds, which can significantly impact I/O bandwidth. In such
> cases, only these problematic devices need to have idle D3 disabled,
> rather than all devices globally.
>
> Introduce a new module parameter 'disable_idle_d3_ids' to allow users to
> specify a list of vendor:device IDs that should have idle D3 disabled.
>
> To support this, add a 'disable_idle_d3' flag to struct
> vfio_pci_core_device. This flag is initialized during device probe
> based on both the global 'disable_idle_d3' parameter and the new
> 'disable_idle_d3_ids' list. All runtime PM decisions are then shifted
> to use this per-device flag.
>
> In vfio_pci_dev_set_try_reset(), update the logic to iterate through
> all devices in the dev_set and respect their individual D3 settings
> when performing a bus reset.

There are device flags that can be set by quirks to handle this:

enum pci_dev_flags {
	...
	/* Device configuration is irrevocably lost if disabled into D3 */
	PCI_DEV_FLAGS_NO_D3 = (__force pci_dev_flags_t) (1 << 1),
	...
	/* Do not use PM reset even if device advertises NoSoftRst- */
	PCI_DEV_FLAGS_NO_PM_RESET = (__force pci_dev_flags_t) (1 << 7),

Ideally vfio-pci.disable_idle_d3 would be your debug tool for
evaluating issues with device level D3 support. If an incompatible
device is found, we should attempt to resolve issues, like link
re-training, or at least contribute a quirk for the device so that all
users benefit, not just those with a magic list of broken devices.
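
To illustrate the quirk approach, here is a rough userspace model of what
such a fixup does. It is only a sketch under stated assumptions: the
vendor:device pair 0x1234:0x5678 is a placeholder, and struct pci_dev,
struct pci_fixup, apply_fixups() and may_enter_d3() are simplified
stand-ins invented for this example, not the kernel definitions. In the
kernel itself the hook would normally live in drivers/pci/quirks.c and be
registered with one of the DECLARE_PCI_FIXUP_*() macros rather than a
hand-rolled table.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the kernel flags; bit positions match the enum quoted above. */
#define PCI_DEV_FLAGS_NO_D3		(1u << 1)
#define PCI_DEV_FLAGS_NO_PM_RESET	(1u << 7)

/* Minimal model of a PCI device (NOT the kernel's struct pci_dev). */
struct pci_dev {
	uint16_t vendor;
	uint16_t device;
	unsigned int dev_flags;
};

/* Quirk hook: mark the device as one that must never be put into D3. */
static void quirk_no_d3(struct pci_dev *dev)
{
	dev->dev_flags |= PCI_DEV_FLAGS_NO_D3;
}

/* Hypothetical fixup table; the IDs below are placeholders. */
struct pci_fixup {
	uint16_t vendor;
	uint16_t device;
	void (*hook)(struct pci_dev *dev);
};

static const struct pci_fixup fixups[] = {
	{ 0x1234, 0x5678, quirk_no_d3 },
};

/* Run every fixup whose vendor:device ID matches the device. */
static void apply_fixups(struct pci_dev *dev)
{
	size_t i;

	for (i = 0; i < sizeof(fixups) / sizeof(fixups[0]); i++) {
		if (fixups[i].vendor == dev->vendor &&
		    fixups[i].device == dev->device)
			fixups[i].hook(dev);
	}
}

/* A power-management path would consult the flag before entering D3. */
static int may_enter_d3(const struct pci_dev *dev)
{
	return !(dev->dev_flags & PCI_DEV_FLAGS_NO_D3);
}
```

The point of doing it this way is that the flag travels with the device
for every driver and every user, so no per-driver ID list is needed.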

You also have the reset_method sysfs attribute at your disposal to
manage how we trigger a function-scoped reset. Thanks,

Alex