Re: [PATCH v2 0/2] PCI/ASPM: Enable ASPM and Clock PM by default on devicetree platforms
From: Bjorn Helgaas
Date: Tue Nov 11 2025 - 18:33:15 EST
On Tue, Nov 11, 2025 at 03:51:03AM -0300, Val Packett wrote:
> On 11/8/25 1:18 PM, Dmitry Baryshkov wrote:
> > On Mon, Sep 22, 2025 at 09:46:43PM +0530, Manivannan Sadhasivam via B4 Relay wrote:
> > > Hi,
> > >
> > > This series is one of the 'let's bite the bullet' kind, where we have decided to
> > > enable all ASPM and Clock PM states by default on devicetree platforms [1]. The
> > > reason devicetree platforms were chosen is that the impact will be minimal
> > > compared to ACPI platforms, so they seemed ideal for testing the waters.
> > >
> > > This series was tested on a Lenovo ThinkPad T14s based on the Snapdragon X1 SoC. All
> > > supported ASPM states are getting enabled for both the NVMe and WLAN devices by
> > > default.
> > > [..]
> > The series breaks the DRM CI on DB820C board (apq8096, PCIe network
> > card, NFS root). The board resets randomly after some time ([1]).
>
> Is that reset.. due to the watchdog resetting a hard-frozen system?
>
> Me and a bunch of other people in the #aarch64-laptops irc/matrix room have
> been experiencing these random hard freezes with ASPM enabled for the NVMe
> SSD, on Hamoa (and Purwa too I think) devices.
I don't know what controllers are in Hamoa and Purwa or what the IDs
of the root ports and endpoints are. Can you collect the Vendor and
Device IDs (from dmesg log or "lspci -n")? If we figure out that some
are broken, we might be able to add quirks to avoid any broken ASPM
states.
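In case it helps, here is a minimal sketch of collecting the IDs straight
from sysfs rather than from lspci. list_pci_ids is a hypothetical helper
name, not an existing tool; it assumes the usual vendor, device, and class
attributes under /sys/bus/pci/devices and strips their 0x prefix. A class
starting with 0604 is a PCI-to-PCI bridge, which includes root ports.

```shell
#!/bin/sh
# list_pci_ids: hypothetical helper, shown for illustration only.
# Prints "<address> <vendor>:<device> class=<class>" for each PCI function.
# The optional argument points at a copy of the sysfs tree instead of the
# live /sys/bus/pci/devices directory.
list_pci_ids() {
    base="${1:-/sys/bus/pci/devices}"
    for dev in "$base"/*; do
        # Skip entries without ID attributes (also covers an unmatched glob).
        [ -r "$dev/vendor" ] && [ -r "$dev/device" ] || continue
        vendor=$(cat "$dev/vendor")
        device=$(cat "$dev/device")
        class=$(cat "$dev/class" 2>/dev/null)
        # sysfs reports these values as 0x-prefixed hex; drop the prefix.
        printf '%s %s:%s class=%s\n' \
            "${dev##*/}" "${vendor#0x}" "${device#0x}" "${class#0x}"
    done
}
```

Calling list_pci_ids with no argument walks the live system; the output
for the root port and the NVMe endpoint is what would be useful here.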
> I have confirmed with a modified (to accept args) enable-aspm.sh script[1]
> that disabling ASPM *only* for the SSD, while keeping it *on* for the WiFi
> adapter, is enough to keep the system stable (got to about a month of uptime
> in that state).
>
> If you have reproduced the same issue on an entirely different SoC, it's
> probably a general driver issue.
>
> Please, please help us debug this using your internal secret debug equipment
> :)
>
>
> [1]: https://gist.github.com/valpackett/8a6207b44364de6b32652f4041fe680f
Can you use "echo 1 > /sys/bus/pci/devices/.../link/l0s_aspm" and
similar (see Documentation/ABI/testing/sysfs-bus-pci) to do this
tuning instead of poking with setpci? If so, it might be easier.
There are ordering requirements that aspm.c tries to observe when ASPM
states are changed via the sysfs interface. enable-aspm.sh might observe
them too (I didn't look that carefully), but if aspm.c gets them wrong,
they're wrong for everybody, so we'd like to know about that.
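A rough sketch of what I mean by using sysfs, under some assumptions:
set_link_pm is a hypothetical helper name, and the attribute names are the
ones listed in Documentation/ABI/testing/sysfs-bus-pci (l0s_aspm, l1_aspm,
l1_1_aspm, l1_2_aspm). The attributes may be absent when the link does not
support a given state, so the helper only writes the ones that exist.

```shell
#!/bin/sh
# set_link_pm: hypothetical helper, shown for illustration only.
# Usage: set_link_pm <device-sysfs-dir> <0|1> [attribute...]
# Writes <0|1> to each link/ attribute and echoes back what the kernel
# reports afterwards. Needs root on a real system.
set_link_pm() {
    [ "$#" -ge 2 ] || { echo "usage: set_link_pm <dir> <0|1> [attr...]" >&2; return 1; }
    dev="$1"; value="$2"; shift 2
    # Default to the ASPM attributes documented in sysfs-bus-pci.
    [ "$#" -gt 0 ] || set -- l0s_aspm l1_aspm l1_1_aspm l1_2_aspm
    for attr in "$@"; do
        f="$dev/link/$attr"
        if [ -w "$f" ]; then
            # Read the value back so a rejected write is visible.
            echo "$value" > "$f" && echo "$attr=$(cat "$f")"
        else
            echo "$attr: not available" >&2
        fi
    done
}
```

For the SSD-only experiment that would be something like
set_link_pm /sys/bus/pci/devices/.../ 0, with the path filled in for the
NVMe endpoint.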
Bjorn