Re: [PATCH 24/25] PCI: Perform reset_resource() and build fail list in sync

From: Ilpo Järvinen
Date: Mon Apr 14 2025 - 10:24:11 EST


On Mon, 14 Apr 2025, Ondřej Jirman wrote:
> On Mon, Apr 14, 2025 at 04:15:01PM +0300, Ilpo Järvinen wrote:
> > On Mon, 14 Apr 2025, Ondřej Jirman wrote:
> > > On Mon, Apr 14, 2025 at 12:52:15PM +0300, Ilpo Järvinen wrote:
> > > > On Fri, 11 Apr 2025, Ondřej Jirman wrote:
> > > >
> > > > > Hello Ilpo,
> > > > >
> > > > > On Tue, Apr 01, 2025 at 08:38:48PM +0300, Ilpo Järvinen wrote:
> > > > > > That log wasn't taken from a bad case, but it doesn't matter anymore as I
> > > > > > could test this with QEMU myself. Thanks for providing enough to make it
> > > > > > easy to reproduce and test locally :-).
> > > > > >
> > > > > > The problem is caused by the assign+release cycle being destructive for
> > > > > > start-aligned resources, because a successful assignment overwrites the start
> > > > > > field. I'll send a patch to fix the problem once I've given it a bit more
> > > > > > thought, as this resource fitting code is somewhat complex.
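For illustration only (not the actual fitting code; all names below are made up): for a start-aligned resource the requested alignment is encoded in res->start while the resource is still unassigned, so once a successful assignment replaces it with the allocated bus address, a later release+retry has to put the saved values back or the alignment information is lost. A minimal sketch:

struct saved_res {
	struct resource *res;
	resource_size_t start, end;	/* pre-assignment values */
	unsigned long flags;
};

static void save_before_assign(struct saved_res *s, struct resource *res)
{
	/*
	 * Remember the original request, including the alignment-encoding
	 * start of a not-yet-assigned, start-aligned resource.
	 */
	s->res = res;
	s->start = res->start;
	s->end = res->end;
	s->flags = res->flags;
}

static void restore_after_release(struct saved_res *s)
{
	/*
	 * Undo what a successful assignment wrote into the resource so
	 * the next fitting attempt sees the original request again.
	 */
	s->res->start = s->start;
	s->res->end = s->end;
	s->res->flags = s->flags;
}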
> > > > >
> > > > > BTW, same thing here on a different SoC:
> > > > >
> > > > > https://lore.kernel.org/lkml/hrcsm2bo4ysqj2ggejndlou32cdc7yiknnm5nrlcoz4d64wall@7te4dfqsoe3y/T/#u
> > > > >
> > > > > There are kernel logs there, too, although I don't have dynamic debug enabled
> > > > > in the kernel.
> > > > >
> > > > > Interestingly, bisect pointed me initially to a different commit. Reverting
> > > > > it helped, but just on one board (QuartzPro64).
> > > >
> > > > Hi,
> > > >
> > > > Since you didn't mention it, I guess you haven't tried the fix:
> > > >
> > > > https://patchwork.kernel.org/project/linux-pci/patch/20250403093137.1481-1-ilpo.jarvinen@xxxxxxxxxxxxxxx/
> > >
> > > This patch works. Thank you.
> >
> > Thanks for testing. If you feel confident enough, please send your
> > Tested-by tag to its thread (not this one).
> >
> > > One difference compared to 6.14 that I'm noticing in the kernel log is that
> > > the "save config" sequences are now printed twice for an unpopulated port. Not
> > > sure if it's related to your patches. Previously they were printed just once.
> >
> > I don't think it is related. The resource fitting/assignment changes
> > should not impact PCI save/restore AFAIK (except by preventing the device
> > from working in the first place if the necessary resources do not get
> > assigned).
> >
> > PCI state saving has always looked like that in most logs I've seen with
> > dyndbg enabled, that is, the state is seemingly saved multiple times, often
> > right after the previous save. So TBH, I'm not convinced it's even something
> > new. If you have backtraces showing the places that invoke pci_save_state(),
> > I can take a look, but I don't expect much to come out of it.
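A minimal sketch of the kind of throwaway instrumentation that produces such backtraces (the dump_stack() call is a local debug-only hack, not upstream code, and the elided body is only roughly described):

	/* drivers/pci/pci.c, debug-only */
	int pci_save_state(struct pci_dev *dev)
	{
		dump_stack();	/* log the caller chain of every state save */

		/*
		 * ... existing body: read the config space into
		 * dev->saved_config_space[] and save the capability state ...
		 */
	}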
>
> You're right. It's unrelated. The traces:
>
> [ 1.878732] show_stack+0x28/0x80 (C)
> [ 1.878751] dump_stack_lvl+0x58/0x74
> [ 1.878762] dump_stack+0x14/0x1c
> [ 1.878772] pci_save_state+0x34/0x1e8
> [ 1.878783] pcie_portdrv_probe+0x330/0x6a8
> [ 1.878794] local_pci_probe+0x3c/0xa0
> [ 1.878804] pci_device_probe+0xac/0x194
> [ 1.878813] really_probe+0xbc/0x388
> [ 1.878825] __driver_probe_device+0x78/0x144
> [ 1.878837] driver_probe_device+0x38/0x110
> [ 1.878850] __device_attach_driver+0xb0/0x150
> [ 1.878862] bus_for_each_drv+0x6c/0xb0
> [ 1.878873] __device_attach+0x98/0x1a0
> [ 1.878886] device_attach+0x10/0x20
> [ 1.878897] pci_bus_add_device+0x84/0x12c
> [ 1.878911] pci_bus_add_devices+0x40/0x90
> [ 1.878924] pci_host_probe+0x88/0xe4
> [ 1.878933] dw_pcie_host_init+0x258/0x714
> [ 1.878945] rockchip_pcie_probe+0x2f0/0x3f8
> [ 1.878956] platform_probe+0x64/0xcc
> [ 1.878965] really_probe+0xbc/0x388
> [ 1.878977] __driver_probe_device+0x78/0x144
> [ 1.878989] driver_probe_device+0x38/0x110
> [ 1.879001] __device_attach_driver+0xb0/0x150
> [ 1.879014] bus_for_each_drv+0x6c/0xb0
> [ 1.879025] __device_attach+0x98/0x1a0
> [ 1.879037] device_initial_probe+0x10/0x20
> [ 1.879049] bus_probe_device+0xa8/0xb8
> [ 1.879061] deferred_probe_work_func+0xa4/0xf0
> [ 1.879072] process_one_work+0x13c/0x2bc
> [ 1.879084] worker_thread+0x284/0x460
> [ 1.879094] kthread+0xe4/0x1a0
> [ 1.879107] ret_from_fork+0x10/0x20
> [ 1.879121] pcieport 0002:20:00.0: save config 0x00: 0x35881d87
> [ 1.879158] pcieport 0002:20:00.0: save config 0x04: 0x00100507
> [ 1.879168] pcieport 0002:20:00.0: save config 0x08: 0x06040001
> [ 1.879177] pcieport 0002:20:00.0: save config 0x0c: 0x00010000
> [ 1.879221] pcieport 0002:20:00.0: save config 0x10: 0x00000000
> [ 1.879232] pcieport 0002:20:00.0: save config 0x14: 0x00000000
> [ 1.879242] pcieport 0002:20:00.0: save config 0x18: 0x00212120
> [ 1.879251] pcieport 0002:20:00.0: save config 0x1c: 0x000000f0
> [ 1.879260] pcieport 0002:20:00.0: save config 0x20: 0x0000fff0
> [ 1.879270] pcieport 0002:20:00.0: save config 0x24: 0x0001fff1
> [ 1.879279] pcieport 0002:20:00.0: save config 0x28: 0x00000000
> [ 1.879289] pcieport 0002:20:00.0: save config 0x2c: 0x00000000
> [ 1.879298] pcieport 0002:20:00.0: save config 0x30: 0x00000000
> [ 1.879307] pcieport 0002:20:00.0: save config 0x34: 0x00000040
> [ 1.879316] pcieport 0002:20:00.0: save config 0x38: 0x00000000
> [ 1.879325] pcieport 0002:20:00.0: save config 0x3c: 0x00020194
> [ 1.879376] pcieport 0002:20:00.0: vgaarb: pci_notify
>
> The second time it's from:
>
> [ 2.004478] Workqueue: pm pm_runtime_work
> [ 2.004488] Call trace:
> [ 2.004491] show_stack+0x28/0x80 (C)
> [ 2.004500] dump_stack_lvl+0x58/0x74
> [ 2.004506] dump_stack+0x14/0x1c
> [ 2.004511] pci_save_state+0x34/0x1e8
> [ 2.004516] pci_pm_runtime_suspend+0xa8/0x1e0
> [ 2.004521] __rpm_callback+0x3c/0x220
> [ 2.004527] rpm_callback+0x6c/0x80
> [ 2.004532] rpm_suspend+0xec/0x674
> [ 2.004537] pm_runtime_work+0x104/0x114
> [ 2.004542] process_one_work+0x13c/0x2bc
> [ 2.004548] worker_thread+0x284/0x460
> [ 2.004552] kthread+0xe4/0x1a0
> [ 2.004559] ret_from_fork+0x10/0x20
> [ 2.004567] pcieport 0002:20:00.0: save config 0x00: 0x35881d87
> [ 2.004578] pcieport 0002:20:00.0: save config 0x04: 0x00100507
> [ 2.004583] pcieport 0002:20:00.0: save config 0x08: 0x06040001
> [ 2.004587] pcieport 0002:20:00.0: save config 0x0c: 0x00010000
> [ 2.004591] pcieport 0002:20:00.0: save config 0x10: 0x00000000
> [ 2.004596] pcieport 0002:20:00.0: save config 0x14: 0x00000000
> [ 2.004600] pcieport 0002:20:00.0: save config 0x18: 0x00212120
> [ 2.004604] pcieport 0002:20:00.0: save config 0x1c: 0x000000f0
> [ 2.004608] pcieport 0002:20:00.0: save config 0x20: 0x0000fff0
> [ 2.004613] pcieport 0002:20:00.0: save config 0x24: 0x0001fff1
> [ 2.004617] pcieport 0002:20:00.0: save config 0x28: 0x00000000
> [ 2.004621] pcieport 0002:20:00.0: save config 0x2c: 0x00000000
> [ 2.004625] pcieport 0002:20:00.0: save config 0x30: 0x00000000
> [ 2.004629] pcieport 0002:20:00.0: save config 0x34: 0x00000040
> [ 2.004633] pcieport 0002:20:00.0: save config 0x38: 0x00000000
> [ 2.004638] pcieport 0002:20:00.0: save config 0x3c: 0x00020194
> [ 2.004662] pcieport 0002:20:00.0: PME# enabled
>
> on 6.15-rc:
>
> cat /sys/class/pci_bus/0002:20/device/0002:20:00.0/power/runtime_status OK
> suspended
>
> on 6.14:
>
> cat /sys/class/pci_bus/0002:20/device/0002:20:00.0/power/runtime_status OK
> active
>
> So some other unrelated change in 6.15-rc just enabled RPM to suspend the unused
> port.

Ah right, that makes sense. Now that you mention it, I recall that D3 became
allowed for PCI bridges on non-x86 archs, but I just couldn't make the
connection in my mind.

--
i.