Re: [PATCH v4 0/8] PCI/pwrctrl: Major rework to integrate pwrctrl devices with controller drivers

From: Shawn Lin

Date: Fri Jan 16 2026 - 03:49:15 EST


On Fri, 2026/01/16 16:30, Manivannan Sadhasivam wrote:
On Fri, Jan 16, 2026 at 04:02:38PM +0800, Shawn Lin wrote:

On Mon, 2026/01/05 21:55, Manivannan Sadhasivam via B4 Relay wrote:
Hi,

This series provides a major rework for the PCI power control (pwrctrl)
framework to enable the pwrctrl devices to be controlled by the PCI controller
drivers.

Problem Statement
=================

Currently, the pwrctrl framework faces two major issues:

1. Missing PERST# integration
2. Inability to properly handle bus extenders such as PCIe switch devices

The first issue arises from the disconnect between the PCI controller drivers and
the pwrctrl framework. At present, the pwrctrl framework just operates on its own
with the help of the PCI core. The pwrctrl devices are created by the PCI core
during the initial bus scan, and the pwrctrl drivers, once bound, just power on the
PCI devices during their probe(). This design conflicts with the PCI Express
Card Electromechanical Specification requirements for PERST# timing. The reason
is, PERST# signals are mostly handled by the controller drivers and often
deasserted even before the pwrctrl drivers probe. According to the spec, PERST#
should be deasserted only after power and reference clock to the device are
stable, within predefined timing parameters.
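
The ordering rule above can be modelled in a few lines of user-space C. This
is only a sketch: struct slot_state and perst_deassert() are hypothetical
names made up for illustration, not kernel APIs, and the spec's timing
parameters are left out entirely.

```c
#include <stdbool.h>

/* Hypothetical model of one PCIe slot; not a kernel structure. */
struct slot_state {
	bool power_stable;	/* supply to the device is up and stable */
	bool refclk_stable;	/* reference clock is running and stable */
	bool perst_asserted;	/* true while the device is held in reset */
};

/*
 * Release PERST# only if power and reference clock are already stable.
 * Returns 0 on success, -1 if the ordering rule would be violated.
 */
static int perst_deassert(struct slot_state *s)
{
	if (!s->power_stable || !s->refclk_stable)
		return -1;

	s->perst_asserted = false;
	return 0;
}
```

In this model, the current flow corresponds to calling perst_deassert() while
power_stable is still false, because the pwrctrl driver has not probed yet.
The reworked flow sets both flags first and only then releases reset.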

The second issue stems from the PCI bus scan completing before pwrctrl drivers
probe. This poses a significant problem for PCI bus extenders like switches
because the PCI core allocates upstream bridge resources during the initial
scan. If the upstream bridge is not hotplug capable, resources are allocated
only for the number of downstream buses detected at scan time, which might be
just one if the switch was not powered and enumerated at that time. Later, when
the pwrctrl driver powers on and enumerates the switch, enumeration fails due to
insufficient upstream bridge resources.
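
As a rough illustration of the bus number side of this problem, here is a toy
user-space model. The struct and helper names are hypothetical, and it assumes
a simple switch layout (one bus for the link to the upstream port, one
internal bus, one bus per downstream port). It does not model memory or I/O
windows.

```c
#include <stdbool.h>

/* Hypothetical model of a bridge's bus number window; not a kernel type. */
struct bus_window {
	unsigned int secondary;		/* first bus number behind the bridge */
	unsigned int subordinate;	/* last bus number behind the bridge */
};

/* Number of bus numbers available behind the bridge. */
static unsigned int buses_available(const struct bus_window *w)
{
	return w->subordinate - w->secondary + 1;
}

/*
 * Bus numbers a switch with 'ports' downstream ports needs behind the
 * Root Port: the link to the upstream port, the switch's internal bus,
 * and one bus per downstream port.
 */
static unsigned int buses_needed_for_switch(unsigned int ports)
{
	return 2 + ports;
}

/* True if the switch can be enumerated within the existing window. */
static bool switch_fits(const struct bus_window *w, unsigned int ports)
{
	return buses_needed_for_switch(ports) <= buses_available(w);
}
```

With a window of { 1, 1 }, as left behind by a scan that found nothing powered
on, switch_fits() is false for any switch, which mirrors the enumeration
failure described above.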

Proposal
========

This series addresses both issues by introducing new individual APIs for pwrctrl
device creation, destruction, power on, and power off operations. Controller
drivers are expected to invoke these APIs during their probe(), remove(),
suspend(), and resume() operations. This integration allows better coordination
between controller drivers and the pwrctrl framework, enabling enhanced features
such as D3Cold support.

The original design aimed to avoid modifying controller drivers for pwrctrl
integration. However, this approach lacked scalability because different
controllers have varying requirements for when devices should be powered on. For
example, controller drivers require devices to be powered on early for
successful PHY initialization.

By using these explicit APIs, controller drivers gain fine grained control over
their associated pwrctrl devices.

This series modifies the pcie-qcom driver (the only consumer of the pwrctrl
framework) to adopt these APIs and also removes the old pwrctrl code from the
PCI core. It can be used as a reference for adding pwrctrl support to other
controller drivers.

For example, to control the 3.3V supply to the PCI slot where the NVMe device is
connected, the modifications below are required:

Devicetree
----------

// In SoC dtsi:

	pci@1bf8000 { // controller node
		...
		pcie1_port0: pcie@0 { // PCI Root Port node
			compatible = "pciclass,0604"; // required for pwrctrl driver bind
			...
		};
	};

// In board dts:

	&pcie1_port0 {
		reset-gpios = <&tlmm 152 GPIO_ACTIVE_LOW>; // optional
		vpcie3v3-supply = <&vreg_nvme>; // NVMe power supply
	};

Controller driver
-----------------

// Select PCI_PWRCTRL_SLOT in controller Kconfig

	probe() {
		...
		// Initialize controller resources
		pci_pwrctrl_create_devices(&pdev->dev);
		pci_pwrctrl_power_on_devices(&pdev->dev);
		// Deassert PERST# (optional)
		...
		pci_host_probe(); // Allocate host bridge and start bus scan
	}

	suspend() {
		// PME_Turn_Off broadcast
		// Assert PERST# (optional)
		pci_pwrctrl_power_off_devices(&pdev->dev);
		...
	}

	resume() {
		...
		pci_pwrctrl_power_on_devices(&pdev->dev);
		// Deassert PERST# (optional)
	}

I will add documentation for the pwrctrl framework in the coming days to make
it easier to use.


This series looks great.

In practice, some PCIe devices may need to be powered down dynamically at
runtime. For example, users might want to disable a PCIe Wi-Fi module when
there's no internet usage. Typically, commands like "ifconfig wlan0 down" only
bring the interface down but leave the Wi-Fi hardware powered. Is there a
mechanism that would allow the Endpoint driver to leverage pwrctrl
dynamically to support such power management scenarios?


Glad that you've brought it up. You are talking about the usecase similar to
Airplane mode in mobiles, and we at Qcom are looking into this usecase in
upstream.

The way to handle this would be by using runtime PM ops. Once your WiFi or a
NIC driver runtime suspends, it will trigger the controller driver runtime
suspend callback. By that time, the controller driver can see if the device is
active or not (checking D states), whether wakeup is requested or not and then
initiate the D3Cold sequence using the APIs introduced in this series.

But that comes with a cost, which is resume latency. It is generally not
advised to do D3Cold during runtime PM due to the latency and also device
lifetime issues (wear-out, etc.). So technically it is possible, but there are
challenges.
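
The check a controller driver might do in its runtime suspend callback can be
sketched as below. This is a user-space model, not kernel code: the enum,
struct, and function names are hypothetical, and it assumes one possible
policy in which any active device or any pending wakeup request blocks the
D3Cold sequence.

```c
#include <stdbool.h>

/* Hypothetical device power states used only by this sketch. */
enum dstate { DSTATE_D0, DSTATE_D3HOT };

/* Hypothetical per-endpoint view as seen by the controller driver. */
struct endpoint {
	enum dstate state;	/* current D state of the device */
	bool wakeup_requested;	/* device asked to stay wakeup capable */
};

/*
 * Decide whether the controller may start the D3Cold sequence.
 * Assumed policy: every device must already be in D3Hot and none may
 * have requested wakeup.
 */
static bool can_enter_d3cold(const struct endpoint *eps, int n)
{
	for (int i = 0; i < n; i++) {
		if (eps[i].state != DSTATE_D3HOT)
			return false;	/* device is still active */
		if (eps[i].wakeup_requested)
			return false;	/* keep power for wakeup */
	}
	return true;
}
```

Only when this returns true would the callback go on to assert PERST# and
power off the devices, in the same order as the suspend path of the cover
letter. The resume latency and wear-out concerns still apply to that step.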


Indeed, that's a fundamental power-performance trade-off for battery-powered devices.

Krishna is going to post a series that allows the pcie-qcom driver to do D3Cold
during system suspend with these APIs. And we do have plans to extend it to
Airplane mode and similar usecases in the future.


Thanks for sharing these details.

- Mani