Re: [PATCH v4 01/10] PCI: Avoid saving error values for config space

From: Lukas Wunner

Date: Sat Nov 22 2025 - 05:58:07 EST

On Mon, Oct 20, 2025 at 10:59:48AM +0200, Niklas Schnelle wrote:
> Yeah I think we're talking past each other a bit. In my mind we're
> really not doing the recovery in ->error_detected() at all. Within that
> callback we only do the notify, and then do nothing in the rest of
> recovery. Only after will the guest do recovery though I do see your
> point that leaving the device in the error state kind of means that
> recovery is still ongoing even if we're not in the recovery handler
> anymore. But then any driver could also just return
> PCI_ERS_RESULT_RECOVERED in error_detected() and land us in the same
> situation.

That would be a bug in the driver. The point of the pci_error_handlers
is to attempt recovery of the device in concert with the driver.
If the driver "fakes" a recovered device towards the PCI core and then
attempts recovery behind the PCI core's back, it gets to keep the pieces...

> But let's put that aside, say we want to implement your model where we
> do check with the guest and its device driver. How would that work,
> somehow error_detected() would have to wait for the guest to proceed
> into recovery and since the guest could just not do that we'd have to
> have some kind of timeout.

Right, a timeout seems reasonable.
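
To make the shape of that concrete, here is a rough user-space model
(not kernel code) of a handler that polls for the guest to enter
recovery and gives up after a bounded number of ticks. All names
(guest_model, wait_for_guest, WAIT_*) are hypothetical, and how the two
outcomes would map to pci_ers_result_t values is deliberately left open:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcome of waiting for the guest; not a kernel type. */
enum wait_outcome {
	WAIT_GUEST_READY,	/* guest entered recovery in time */
	WAIT_TIMED_OUT,		/* guest never reacted, e.g. an old guest */
};

/*
 * Hypothetical guest model: the tick at which the guest enters
 * recovery, or -1 if it never does.
 */
struct guest_model {
	int enters_recovery_at;
};

static bool guest_in_recovery(const struct guest_model *g, int tick)
{
	return g->enters_recovery_at >= 0 && tick >= g->enters_recovery_at;
}

/* Poll once per tick, for ticks 0..timeout_ticks inclusive. */
enum wait_outcome wait_for_guest(const struct guest_model *g,
				 int timeout_ticks)
{
	assert(timeout_ticks >= 0);

	for (int tick = 0; tick <= timeout_ticks; tick++) {
		if (guest_in_recovery(g, tick))
			return WAIT_GUEST_READY;
	}
	return WAIT_TIMED_OUT;
}
```

A guest that never enters recovery simply runs into WAIT_TIMED_OUT, so
the host side does not hang on it indefinitely.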

> Also we can't allow the guest to choose
> PCI_ERS_RESULT_RECOVERED because otherwise we'd again be in the
> situation where recovery is completed without unblocking I/O.

The guest should only return that if the device has really recovered.
On an architecture which blocks I/O upon an error, by definition the
device cannot already be recovered in the ->error_detected() stage.
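
As a sketch only, the host side could sanity-check the guest's answer
along these lines. This is a user-space model, not kernel code: the
MODEL_ERS_* values merely stand in for the PCI_ERS_RESULT_* constants,
and guest_result_plausible() and its io_blocked flag are hypothetical:

```c
#include <stdbool.h>

/* Stand-ins for the kernel's pci_ers_result_t values (model only). */
enum model_ers_result {
	MODEL_ERS_CAN_RECOVER,
	MODEL_ERS_NEED_RESET,
	MODEL_ERS_DISCONNECT,
	MODEL_ERS_RECOVERED,
};

/*
 * While the platform still blocks I/O to the device, a claim of
 * "recovered" from the ->error_detected() stage cannot be true,
 * so treat it as implausible. Everything else is accepted.
 */
bool guest_result_plausible(enum model_ers_result result, bool io_blocked)
{
	if (io_blocked && result == MODEL_ERS_RECOVERED)
		return false;
	return true;
}
```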

> And if we
> want to stick to the architecture QEMU/KVM will have to kind of have a
> mode where after being informed of ongoing recovery for a device they
> intercept attempts to reset / firmware calls for reset and turn that
> into the correct return. And somehow also deal with the timeout because
> e.g. old Linux guests won't do recovery but there is also no
> architected way for a guest to say that it does recovery.

I guess there are gaps in QEMU with regard to error recovery,
but I think the solution is to add the missing functionality,
not to work around the gaps.

Thanks,

Lukas