Hi Krishna,
On 03.10.2024 08:02, Krishna chaitanya chundru wrote:
The Controller driver is the parent device of the PCIe host bridge,
PCI-PCI bridge and PCIe endpoint as shown below.
PCIe controller(Top level parent & parent of host bridge)
|
v
PCIe Host bridge(Parent of PCI-PCI bridge)
|
v
PCI-PCI bridge(Parent of endpoint driver)
|
v
PCIe endpoint driver
Now, when the controller device goes to runtime suspend, PM framework
will check the runtime PM state of the child device (host bridge) and
will find it to be disabled. So it will allow the parent (controller
device) to go to runtime suspend. Only if the child device's state was
'active' it will prevent the parent to get suspended.
It is a property of the runtime PM framework that it can only
follow continuous dependency chains. That is, if there is a device
with runtime PM disabled in a dependency chain, runtime PM cannot be
enabled for devices below it and above it in that chain both at the
same time.
Since runtime PM is disabled for host bridge, the state of the child
devices under the host bridge is not taken into account by PM framework
for the top level parent, PCIe controller. So PM framework, allows
the controller driver to enter runtime PM irrespective of the state
of the devices under the host bridge. And this causes the topology
breakage and also possible PM issues like controller driver goes to
runtime suspend while endpoint driver is doing some transfers.
Because of the above, in order to enable runtime PM for a PCIe
controller device, one needs to ensure that runtime PM is enabled for
all devices in every dependency chain between it and any PCIe endpoint
(as runtime PM is enabled for PCIe endpoints).
This means that runtime PM needs to be enabled for the host bridge
device, which is present in all of these dependency chains.
After this change, the host bridge device will be runtime-suspended
by the runtime PM framework automatically after suspending its last
child and it will be runtime-resumed automatically before resuming its
first child which will allow the runtime PM framework to track
dependencies between the host bridge device and all of its
descendants.
Signed-off-by: Krishna chaitanya chundru <quic_krichai@xxxxxxxxxxx>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@xxxxxxxxxx>
This patch landed in today's linux-next as commit 02787a3b4d10 ("PCI/PM:
Enable runtime power management for host bridges"). In my tests I found
that it triggers a warning on StarFive VisionFive2 RISC-V board. It
looks that some more changes are needed in the dwc-pci driver or so.
There is a message from runtime pm subsystem about inactive device with
active children and suspicious locking pattern. Here is the log I
observed on that board:
---->8---
pcie-starfive 940000000.pcie: port link up
pcie-starfive 940000000.pcie: PCI host bridge to bus 0000:00
pci_bus 0000:00: root bus resource [bus 00-ff]
pci_bus 0000:00: root bus resource [mem 0x30000000-0x37ffffff]
pci_bus 0000:00: root bus resource [mem 0x900000000-0x93fffffff pref]
pci 0000:00:00.0: [1556:1111] type 01 class 0x060400 PCIe Root Port
pci 0000:00:00.0: PCI bridge to [bus 00]
pci 0000:00:00.0: bridge window [mem 0x00000000-0x000fffff]
pci 0000:00:00.0: bridge window [mem 0x00000000-0x000fffff 64bit pref]
pci 0000:00:00.0: supports D1 D2
pci 0000:00:00.0: PME# supported from D0 D1 D2 D3hot D3cold
pci 0000:00:00.0: bridge configuration invalid ([bus 00-00]), reconfiguring
pci 0000:01:00.0: [1106:3483] type 00 class 0x0c0330 PCIe Endpoint
pci 0000:01:00.0: BAR 0 [mem 0x00000000-0x00000fff 64bit]
pci 0000:01:00.0: PME# supported from D0 D3cold
pci_bus 0000:01: busn_res: [bus 01-ff] end is updated to 01
pci 0000:00:00.0: bridge window [mem 0x30000000-0x300fffff]: assigned
pci 0000:01:00.0: BAR 0 [mem 0x30000000-0x30000fff 64bit]: assigned
pci 0000:00:00.0: PCI bridge to [bus 01]
pci 0000:00:00.0: bridge window [mem 0x30000000-0x300fffff]
pci_bus 0000:00: resource 4 [mem 0x30000000-0x37ffffff]
pci_bus 0000:00: resource 5 [mem 0x900000000-0x93fffffff pref]
pci_bus 0000:01: resource 1 [mem 0x30000000-0x300fffff]
pcieport 0000:00:00.0: enabling device (0000 -> 0002)
pcieport 0000:00:00.0: PME: Signaling with IRQ 53
pci 0000:01:00.0: enabling device (0000 -> 0002)
xhci_hcd 0000:01:00.0: xHCI Host Controller
xhci_hcd 0000:01:00.0: new USB bus registered, assigned bus number 1
xhci_hcd 0000:01:00.0: hcc params 0x002841eb hci version 0x100 quirks
0x0000000000000890
xhci_hcd 0000:01:00.0: xHCI Host Controller
xhci_hcd 0000:01:00.0: new USB bus registered, assigned bus number 2
xhci_hcd 0000:01:00.0: Host supports USB 3.0 SuperSpeed
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 1 port detected
hub 2-0:1.0: USB hub found
hub 2-0:1.0: 4 ports detected
pcie-starfive 940000000.pcie: Enabling runtime PM for inactive device
with active children
======================================================
WARNING: possible circular locking dependency detected
6.12.0-rc1+ #15438 Not tainted
------------------------------------------------------
systemd-udevd/159 is trying to acquire lock:
ffffffff81822520 (console_owner){-.-.}-{0:0}, at:
console_lock_spinning_enable+0x3a/0x60
but task is already holding lock:
ffffffd6c0b3d980 (&dev->power.lock){-...}-{2:2}, at:
pm_runtime_enable+0x1e/0xb6
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #2 (&dev->power.lock){-...}-{2:2}:
lock_acquire.part.0+0xa2/0x1d4
lock_acquire+0x44/0x5a
_raw_spin_lock_irqsave+0x3a/0x64
__pm_runtime_resume+0x40/0x86
__uart_start+0x40/0xb2
uart_write+0x90/0x220
n_tty_write+0x10a/0x40e
file_tty_write.constprop.0+0x10c/0x230
redirected_tty_write+0x84/0xbc
do_iter_readv_writev+0x100/0x166
vfs_writev+0xc6/0x398
do_writev+0x5c/0xca
__riscv_sys_writev+0x16/0x1e
do_trap_ecall_u+0x1b6/0x1e2
_new_vmalloc_restore_context_a0+0xc2/0xce
-> #1 (&port_lock_key){-.-.}-{2:2}:
lock_acquire.part.0+0xa2/0x1d4
lock_acquire+0x44/0x5a
_raw_spin_lock_irqsave+0x3a/0x64
serial8250_console_write+0x2a0/0x474
univ8250_console_write+0x22/0x2a
console_flush_all+0x2f6/0x3c8
console_unlock+0x80/0x1a8
vprintk_emit+0x10e/0x2e0
vprintk_default+0x16/0x1e
vprintk+0x1e/0x3c
_printk+0x36/0x50
register_console+0x292/0x418
serial_core_register_port+0x6d6/0x6dc
serial_ctrl_register_port+0xc/0x14
uart_add_one_port+0xc/0x14
serial8250_register_8250_port+0x288/0x428
dw8250_probe+0x422/0x518
platform_probe+0x4e/0x92
really_probe+0x10a/0x2da
__driver_probe_device.part.0+0xb2/0xe8
driver_probe_device+0x78/0xc4
__device_attach_driver+0x66/0xc6
bus_for_each_drv+0x5c/0xb0
__device_attach+0x84/0x13c
device_initial_probe+0xe/0x16
bus_probe_device+0x88/0x8a
deferred_probe_work_func+0xd4/0xee
process_one_work+0x1e0/0x534
worker_thread+0x166/0x2cc
kthread+0xc4/0xe0
ret_from_fork+0xe/0x18
-> #0 (console_owner){-.-.}-{0:0}:
check_noncircular+0x10e/0x122
__lock_acquire+0x105c/0x1f4a
lock_acquire.part.0+0xa2/0x1d4
lock_acquire+0x44/0x5a
console_lock_spinning_enable+0x58/0x60
console_flush_all+0x2cc/0x3c8
console_unlock+0x80/0x1a8
vprintk_emit+0x10e/0x2e0
dev_vprintk_emit+0xea/0x112
dev_printk_emit+0x2e/0x48
__dev_printk+0x40/0x5c
_dev_warn+0x46/0x60
pm_runtime_enable+0x98/0xb6
starfive_pcie_probe+0x12e/0x228 [pcie_starfive]
platform_probe+0x4e/0x92
really_probe+0x10a/0x2da
__driver_probe_device.part.0+0xb2/0xe8
driver_probe_device+0x78/0xc4
__driver_attach+0x54/0x162
bus_for_each_dev+0x58/0xa4
driver_attach+0x1a/0x22
bus_add_driver+0xec/0x1ce
driver_register+0x3e/0xd8
__platform_driver_register+0x1c/0x24
starfive_pcie_driver_init+0x20/0x1000 [pcie_starfive]
do_one_initcall+0x5e/0x28c
do_init_module+0x52/0x1ba
load_module+0x1440/0x18f0
init_module_from_file+0x76/0xae
idempotent_init_module+0x18c/0x24a
__riscv_sys_finit_module+0x52/0x82
do_trap_ecall_u+0x1b6/0x1e2
_new_vmalloc_restore_context_a0+0xc2/0xce
other info that might help us debug this:
Chain exists of:
console_owner --> &port_lock_key --> &dev->power.lock
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&dev->power.lock);
lock(&port_lock_key);
lock(&dev->power.lock);
lock(console_owner);
*** DEADLOCK ***
4 locks held by systemd-udevd/159:
#0: ffffffd6c0b3d8f8 (&dev->mutex){....}-{3:3}, at:
__driver_attach+0x4c/0x162
#1: ffffffd6c0b3d980 (&dev->power.lock){-...}-{2:2}, at:
pm_runtime_enable+0x1e/0xb6
#2: ffffffff818223b0 (console_lock){+.+.}-{0:0}, at:
dev_vprintk_emit+0xea/0x112
#3: ffffffff81822448 (console_srcu){....}-{0:0}, at:
console_flush_all+0x4e/0x3c8
stack backtrace:
CPU: 1 UID: 0 PID: 159 Comm: systemd-udevd Not tainted 6.12.0-rc1+ #15438
Hardware name: StarFive VisionFive 2 v1.2A (DT)
Call Trace:
[<ffffffff80006a02>] dump_backtrace+0x1c/0x24
[<ffffffff80b70b3e>] show_stack+0x2c/0x38
[<ffffffff80b7f8f8>] dump_stack_lvl+0x7a/0xb4
[<ffffffff80b7f946>] dump_stack+0x14/0x1c
[<ffffffff8007fbc2>] print_circular_bug+0x2aa/0x350
[<ffffffff8007fd76>] check_noncircular+0x10e/0x122
[<ffffffff80082a3c>] __lock_acquire+0x105c/0x1f4a
[<ffffffff80084148>] lock_acquire.part.0+0xa2/0x1d4
[<ffffffff800842be>] lock_acquire+0x44/0x5a
[<ffffffff8008b3e8>] console_lock_spinning_enable+0x58/0x60
[<ffffffff8008c0c2>] console_flush_all+0x2cc/0x3c8
[<ffffffff8008c23e>] console_unlock+0x80/0x1a8
[<ffffffff8008c710>] vprintk_emit+0x10e/0x2e0
[<ffffffff80b79ec8>] dev_vprintk_emit+0xea/0x112
[<ffffffff80b79f1e>] dev_printk_emit+0x2e/0x48
[<ffffffff80b7a006>] __dev_printk+0x40/0x5c
[<ffffffff80b7a2be>] _dev_warn+0x46/0x60
[<ffffffff807037ae>] pm_runtime_enable+0x98/0xb6
[<ffffffff02763240>] starfive_pcie_probe+0x12e/0x228 [pcie_starfive]
[<ffffffff806f83f6>] platform_probe+0x4e/0x92
[<ffffffff80b7a680>] really_probe+0x10a/0x2da
[<ffffffff80b7a902>] __driver_probe_device.part.0+0xb2/0xe8
[<ffffffff806f6112>] driver_probe_device+0x78/0xc4
[<ffffffff806f6278>] __driver_attach+0x54/0x162
[<ffffffff806f42e6>] bus_for_each_dev+0x58/0xa4
[<ffffffff806f5c9e>] driver_attach+0x1a/0x22
[<ffffffff806f54ce>] bus_add_driver+0xec/0x1ce
[<ffffffff806f7112>] driver_register+0x3e/0xd8
[<ffffffff806f80cc>] __platform_driver_register+0x1c/0x24
[<ffffffff027ea020>] starfive_pcie_driver_init+0x20/0x1000 [pcie_starfive]
[<ffffffff800027ba>] do_one_initcall+0x5e/0x28c
[<ffffffff800bb4d8>] do_init_module+0x52/0x1ba
[<ffffffff800bcc8a>] load_module+0x1440/0x18f0
[<ffffffff800bd2f0>] init_module_from_file+0x76/0xae
[<ffffffff800bd4b4>] idempotent_init_module+0x18c/0x24a
[<ffffffff800bd5fc>] __riscv_sys_finit_module+0x52/0x82
[<ffffffff80b804a0>] do_trap_ecall_u+0x1b6/0x1e2
[<ffffffff80b8c536>] _new_vmalloc_restore_context_a0+0xc2/0xce
pcie-starfive 940000000.pcie: driver: 'pcie-starfive': driver_bound:
bound to device
/soc/pcie@940000000 Dropping the fwnode link to
/soc/pcie@940000000/interrupt-controller
pcie-starfive 940000000.pcie: Dropping the link to 10210000.phy
device: 'platform:10210000.phy--platform:940000000.pcie': device_unregister
pcie-starfive 940000000.pcie: Dropping the link to 10230000.clock-controller
device: 'platform:10230000.clock-controller--platform:940000000.pcie':
device_unregister
pcie-starfive 940000000.pcie: bus: 'platform': really_probe: bound
device to driver pcie-starfive
platform 9c0000000.pcie: bus: 'platform': __driver_probe_device: matched
device with driver pcie-starfive
platform 9c0000000.pcie: bus: 'platform': really_probe: probing driver
pcie-starfive with device
pcie-starfive 9c0000000.pcie: no init pinctrl state
pcie-starfive 9c0000000.pcie: no sleep pinctrl state
pcie-starfive 9c0000000.pcie: no idle pinctrl state
device: 'phy:phy-10220000.phy.1--platform:9c0000000.pcie': device_add
devices_kset: Moving 9c0000000.pcie to end of list
PM: Moving platform:9c0000000.pcie to end of list
pcie-starfive 9c0000000.pcie: Linked as a consumer to phy-10220000.phy.1
pcie-starfive 9c0000000.pcie: host bridge /soc/pcie@9c0000000 ranges:
pcie-starfive 9c0000000.pcie: MEM 0x0038000000..0x003fffffff ->
0x0038000000
pcie-starfive 9c0000000.pcie: MEM 0x0980000000..0x09bfffffff ->
0x0980000000
--->8---
---
Changes in v5:
- call pm_runtime_no_callbacks() as suggested by Rafael.
- include the commit texts as suggested by Rafael.
- Link to v4: https://lore.kernel.org/linux-pci/20240708-runtime_pm-v4-1-c02a3663243b@xxxxxxxxxxx/
Changes in v4:
- Changed pm_runtime_enable() to devm_pm_runtime_enable() (suggested by mayank)
- Link to v3: https://lore.kernel.org/lkml/20240609-runtime_pm-v3-1-3d0460b49d60@xxxxxxxxxxx/
Changes in v3:
- Moved the runtime API call's from the dwc driver to PCI framework
as it is applicable for all (suggested by mani)
- Updated the commit message.
- Link to v2: https://lore.kernel.org/all/20240305-runtime_pm_enable-v2-1-a849b74091d1@xxxxxxxxxxx
Changes in v2:
- Updated commit message as suggested by mani.
- Link to v1: https://lore.kernel.org/r/20240219-runtime_pm_enable-v1-1-d39660310504@xxxxxxxxxxx
---
---
drivers/pci/probe.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 4f68414c3086..8409e1dde0d1 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -3106,6 +3106,11 @@ int pci_host_probe(struct pci_host_bridge *bridge)
pcie_bus_configure_settings(child);
pci_bus_add_devices(bus);
+
+ pm_runtime_set_active(&bridge->dev);
+ pm_runtime_no_callbacks(&bridge->dev);
+ devm_pm_runtime_enable(&bridge->dev);
+
return 0;
}
EXPORT_SYMBOL_GPL(pci_host_probe);
---
base-commit: c02d24a5af66a9806922391493205a344749f2c4
change-id: 20241003-runtime_pm-655d48356c8b
Best regards,
Best regards