Re: [PATCH] PCI: Add sysfs attribute for PCI device power state

From: Bjorn Helgaas
Date: Wed Nov 18 2020 - 14:20:12 EST


[+cc Krzysztof, Rafael in case you have a suggestion about the
filename (or anything else :))]

On Mon, Nov 02, 2020 at 03:15:20PM +0100, Maximilian Luz wrote:
> While most PCI power-states can be queried from user-space via lspci,
> this has some limits. Specifically, lspci fails to provide an accurate
> value when the device is in D3cold as it has to resume the device before
> it can access its power state via the configuration space, leading to it
> reporting D0 or another on-state. Thus lspci can, for example, not be
> used to diagnose power-consumption issues for devices that can enter
> D3cold or to ensure that devices properly enter D3cold at all.
>
> To alleviate this issue, introduce a new sysfs device attribute for the
> PCI power state, showing the current power state as seen by the kernel.
>
> Signed-off-by: Maximilian Luz <luzmaximilian@xxxxxxxxx>

Applied as below to pci/pm for v5.11.

> ---
> Documentation/ABI/testing/sysfs-bus-pci | 9 +++++++++
> drivers/pci/pci-sysfs.c | 12 ++++++++++++
> 2 files changed, 21 insertions(+)
>
> diff --git a/Documentation/ABI/testing/sysfs-bus-pci b/Documentation/ABI/testing/sysfs-bus-pci
> index 450296cc7948..881040af2611 100644
> --- a/Documentation/ABI/testing/sysfs-bus-pci
> +++ b/Documentation/ABI/testing/sysfs-bus-pci
> @@ -360,3 +360,12 @@ Contact: Heiner Kallweit <hkallweit1@xxxxxxxxx>
> Description: If ASPM is supported for an endpoint, these files can be
> used to disable or enable the individual power management
> states. Write y/1/on to enable, n/0/off to disable.
> +
> +What: /sys/bus/pci/devices/.../power_state

I guess this will be alongside the existing "power/" directory.
Rafael, is there any precedent we should be following here?

> +Date: November 2020
> +Contact: Linux PCI developers <linux-pci@xxxxxxxxxxxxxxx>
> +Description:
> + This file contains the current PCI power state of the device.
> + The value comes from the PCI kernel device state and can be one
> + of: "unknown", "error", "D0", "D1", "D2", "D3hot", "D3cold".
> + The file is read only.
> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
> index d15c881e2e7e..b15f754e6346 100644
> --- a/drivers/pci/pci-sysfs.c
> +++ b/drivers/pci/pci-sysfs.c
> @@ -124,6 +124,17 @@ static ssize_t cpulistaffinity_show(struct device *dev,
> }
> static DEVICE_ATTR_RO(cpulistaffinity);
>
> +/* PCI power state */
> +static ssize_t power_state_show(struct device *dev,
> + struct device_attribute *attr, char *buf)
> +{
> + struct pci_dev *pci_dev = to_pci_dev(dev);
> + pci_power_t state = READ_ONCE(pci_dev->current_state);
> +
> + return sprintf(buf, "%s\n", pci_power_name(state));
> +}
> +static DEVICE_ATTR_RO(power_state);
> +
> /* show resources */
> static ssize_t resource_show(struct device *dev, struct device_attribute *attr,
> char *buf)
> @@ -581,6 +592,7 @@ static ssize_t driver_override_show(struct device *dev,
> static DEVICE_ATTR_RW(driver_override);
>
> static struct attribute *pci_dev_attrs[] = {
> + &dev_attr_power_state.attr,
> &dev_attr_resource.attr,
> &dev_attr_vendor.attr,
> &dev_attr_device.attr,

commit 9f1c0ebea21a ("PCI: Add sysfs attribute for device power state")
Author: Maximilian Luz <luzmaximilian@xxxxxxxxx>
Date: Mon Nov 2 15:15:20 2020 +0100

PCI: Add sysfs attribute for device power state

While PCI power states D0-D3hot can be queried from user-space via lspci,
D3cold cannot. lspci cannot provide an accurate value when the device is
in D3cold as it has to restore the device to D0 before it can access its
power state via the configuration space, leading to it reporting D0 or
another on-state. Thus lspci cannot be used to diagnose power consumption
issues for devices that can enter D3cold or to ensure that devices properly
enter D3cold at all.

Add a new sysfs device attribute for the PCI power state, showing the
current power state as seen by the kernel.

[bhelgaas: drop READ_ONCE(), see discussion at the link]
Link: https://lore.kernel.org/r/20201102141520.831630-1-luzmaximilian@xxxxxxxxx
Signed-off-by: Maximilian Luz <luzmaximilian@xxxxxxxxx>
Signed-off-by: Bjorn Helgaas <bhelgaas@xxxxxxxxxx>

diff --git a/Documentation/ABI/testing/sysfs-bus-pci b/Documentation/ABI/testing/sysfs-bus-pci
index 450296cc7948..881040af2611 100644
--- a/Documentation/ABI/testing/sysfs-bus-pci
+++ b/Documentation/ABI/testing/sysfs-bus-pci
@@ -360,3 +360,12 @@ Contact: Heiner Kallweit <hkallweit1@xxxxxxxxx>
Description: If ASPM is supported for an endpoint, these files can be
used to disable or enable the individual power management
states. Write y/1/on to enable, n/0/off to disable.
+
+What: /sys/bus/pci/devices/.../power_state
+Date: November 2020
+Contact: Linux PCI developers <linux-pci@xxxxxxxxxxxxxxx>
+Description:
+ This file contains the current PCI power state of the device.
+ The value comes from the PCI kernel device state and can be one
+ of: "unknown", "error", "D0", "D1", "D2", "D3hot", "D3cold".
+ The file is read only.
diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index d15c881e2e7e..fb072f4b3176 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -124,6 +124,15 @@ static ssize_t cpulistaffinity_show(struct device *dev,
}
static DEVICE_ATTR_RO(cpulistaffinity);

+static ssize_t power_state_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct pci_dev *pdev = to_pci_dev(dev);
+
+ return sprintf(buf, "%s\n", pci_power_name(pdev->current_state));
+}
+static DEVICE_ATTR_RO(power_state);
+
/* show resources */
static ssize_t resource_show(struct device *dev, struct device_attribute *attr,
char *buf)
@@ -581,6 +590,7 @@ static ssize_t driver_override_show(struct device *dev,
static DEVICE_ATTR_RW(driver_override);

static struct attribute *pci_dev_attrs[] = {
+ &dev_attr_power_state.attr,
&dev_attr_resource.attr,
&dev_attr_vendor.attr,
&dev_attr_device.attr,
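
For anyone who wants to consume the new attribute from user space, here is a
minimal sketch of a reader, assuming only what the ABI entry above documents:
the file holds a single line with one of the listed state names. The helper
names read_sysfs_line() and is_known_power_state() are made up for this
illustration and are not part of the patch or of any library.

```c
#include <stdio.h>
#include <string.h>

/* State names as documented for power_state in sysfs-bus-pci. */
static const char *const pci_power_state_names[] = {
	"unknown", "error", "D0", "D1", "D2", "D3hot", "D3cold",
};

/*
 * Hypothetical helper: read the first line of a sysfs attribute into buf
 * and strip the trailing newline.  Returns 0 on success, -1 on failure.
 */
int read_sysfs_line(const char *path, char *buf, size_t len)
{
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	if (!fgets(buf, (int)len, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	buf[strcspn(buf, "\n")] = '\0';
	return 0;
}

/* Hypothetical helper: return 1 if name is a documented state, else 0. */
int is_known_power_state(const char *name)
{
	size_t n = sizeof(pci_power_state_names) /
		   sizeof(pci_power_state_names[0]);
	size_t i;

	for (i = 0; i < n; i++)
		if (strcmp(name, pci_power_state_names[i]) == 0)
			return 1;
	return 0;
}
```

A caller would pass the attribute path of a specific device, for example
/sys/bus/pci/devices/0000:01:00.0/power_state (the address here is only a
placeholder), and compare the result against "D3cold". Since the attribute
reports the state as tracked by the kernel rather than reading it from
configuration space, this should avoid the resume that lspci triggers.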