Re: Bug#1131025: [6.12.y regression] Regression with 58130e7ce6cb ("PCI/ERR: Ensure error recoverability at all times"): echo vfio-pci >driver_override does not work for DVB Adapter

From: Alexandre N.

Date: Sun Apr 12 2026 - 14:04:56 EST


On 4/4/26 11:54, Bernd Schumacher wrote:
> On Friday, 03.04.2026 at 16:58 +0200, Lukas Wunner wrote:
>> If you cherry-pick these two upstream commits onto v6.12.73,
>> does the issue go away?
>>
>>   4d4c10f763d7 ("PCI: Explicitly put devices into D0 when
>>   initializing")
>>   907a7a2e5bf4 ("PCI/PM: Set up runtime PM even for devices without
>>   PCI PM")
>
> Yes, with the two patches together it works!!!
>
> I am not sure, if you need the dmesg. But I have attached it.
>
> Thanks!
> Bernd

Hi,

(This is my first post to a kernel mailing list, apologies if I've
made any formatting or etiquette mistakes.)

This is an independent confirmation on different hardware and a
different stable branch, plus test results for the proposed fix on
6.19.11.

Hardware:  PCI 1b4b:9215 Marvell 88SE9215 PCIe 2.0 x1 4-port
           SATA 6 Gb/s controller, whole-device passthrough to a
           Windows 10 guest via QEMU/libvirt on an AMD Ryzen 7 7700
           8-Core x86_64 host (Arch Linux).
Last good: linux 6.18.9
First bad: linux 6.18.13 (contains stable backport 71c50e60421b
           of upstream a2f1e22390ac, "PCI/ERR: Ensure error
           recoverability at all times", first backported in 6.18.10)
Also bad:  linux 6.19.11 (mainline carries a2f1e22390ac)

Confirmed by rolling linux back to 6.18.9 with everything else
untouched: the problem vanishes. Rolling forward to 6.18.13 or any
later version in 6.18/6.19 reproduces it every time.

Symptom
-------
Guest side:

* Stock Microsoft storahci.sys: Device Manager shows the controller
with status "This device cannot start. (Code 10) An I/O adapter
hardware error has occurred." None of the disks wired to the
controller are visible.
* Vendor Marvell 92xx storage driver: Device Manager reports the
controller as healthy and the driver loads, but the physically
attached disks are not detected.

Host side (before applying the pci-stub workaround described below):
the libvirt-managed detach/rebind cycle on VM shutdown triggers a
page fault on the host in ahci's probe path:

BUG: unable to handle page fault for address: ...
RIP: 0010:ahci_save_initial_config+0x1aa/0x2e0
Call Trace:
ahci_init_one+0x3a7/0xbf0
local_pci_probe+0x46/0xa0
...

After that, virt-manager becomes unresponsive and any command to
bind/unbind the driver or reset the device hangs. When shutting down
the host, the system logs a shutdown sequence but never actually
powers off. A hard shutdown (long press on the power button)
is necessary.

On 6.18.9 the same libvirt detach/rebind round trip through ahci
is clean: there is no page fault, virt-manager stays responsive and
the host shuts down properly.
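
For anyone trying to reproduce the host-side part, here is a minimal
sketch for checking which host driver owns the controller before and
after the detach/rebind cycle. It only reads the "driver" symlink in
sysfs. The PCI address 0000:03:00.0 is a placeholder and is not taken
from this report; the function name and the optional sysfs-root
argument are illustrative as well.

```shell
#!/bin/sh
# Print the name of the driver bound to a PCI device, or "none".
# $1: PCI address (domain:bus:dev.fn)
# $2: sysfs root, defaults to /sys (only there to allow dry runs)
bound_driver() {
    link="${2:-/sys}/bus/pci/devices/$1/driver"
    if [ -L "$link" ]; then
        # The symlink target ends in .../drivers/<driver name>.
        basename "$(readlink "$link")"
    else
        echo none
    fi
}

# Placeholder address; substitute the one that
# "lspci -D -d 1b4b:9215" reports on your host.
bound_driver "0000:03:00.0"
```

With passthrough active one would expect vfio-pci here, and ahci (or
pci-stub, with the workaround) once the guest has released the device.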

Testing the proposed fix on 6.19.11
-----------------------------------
I cherry-picked both fix commits identified in this thread:

4d4c10f763d7 ("PCI: Explicitly put devices into D0 when
initializing")
907a7a2e5bf4 ("PCI/PM: Set up runtime PM even for devices without
PCI PM")

onto v6.19.11. The resulting diff against the unpatched 6.19.11 tree
is minimal, with only one added line (pm_runtime_enable() after
pm_runtime_set_active() in pci_pm_init()). The guest-side regression
persists: still Code 10 with storahci, still empty SATA ports
with the Marvell vendor driver.

I'd be happy to test further patches if that would help
narrow down what the 88SE9215 additionally needs.

Workaround for other affected users searching for this symptom
--------------------------------------------------------------
Pin linux at a version from before a2f1e22390ac was backported
(6.18.9 in my case). Additionally, add pci-stub.ids=1b4b:9215 to the
kernel command line, so that the ahci driver does not bind to
the SATA controller. On the pinned kernel, both the guest-side
and host-side symptoms are absent.
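
A minimal sketch for confirming after a reboot that the parameter
actually reached the kernel. It assumes the exact spelling
pci-stub.ids= and a plain vendor:device entry, so it will not match
other spellings or longer ID forms. The function name and the optional
second argument are illustrative.

```shell
#!/bin/sh
# Succeed if a kernel command line lists the given vendor:device ID
# in a pci-stub.ids= parameter.
# $1: ID such as 1b4b:9215
# $2: command line text, defaults to the contents of /proc/cmdline
has_stub_id() {
    cmdline="${2:-$(cat /proc/cmdline)}"
    for arg in $cmdline; do
        case "$arg" in
            pci-stub.ids=*)
                # The value is a comma-separated list of IDs; wrap it
                # in commas so each entry can be matched exactly.
                ids=",${arg#pci-stub.ids=},"
                case "$ids" in
                    *",$1,"*) return 0 ;;
                esac
                ;;
        esac
    done
    return 1
}

if has_stub_id "1b4b:9215"; then
    echo "pci-stub.ids covers 1b4b:9215"
else
    echo "pci-stub.ids does not cover 1b4b:9215"
fi
```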

I can provide additional info or logs if needed.

Thanks!

Alexandre N.