Re: [PATCH v15 4/7] s390/pci: Store PCI error information for passthrough devices
From: Farhan Ali
Date: Wed May 06 2026 - 13:20:52 EST
On 5/6/2026 2:38 AM, Niklas Schnelle wrote:
-static pci_ers_result_t zpci_event_attempt_error_recovery(struct pci_dev *pdev)
+static pci_ers_result_t zpci_event_attempt_error_recovery(struct pci_dev *pdev,
+ struct zpci_ccdf_err *ccdf)
{
pci_ers_result_t ers_res = PCI_ERS_RESULT_DISCONNECT;
struct zpci_dev *zdev = to_zpci(pdev);
@@ -194,13 +206,6 @@ static pci_ers_result_t zpci_event_attempt_error_recovery(struct pci_dev *pdev)
}
pdev->error_state = pci_channel_io_frozen;
- if (is_passed_through(pdev)) {
- pr_info("%s: Cannot be recovered in the host because it is a pass-through device\n",
- pci_name(pdev));
- status_str = "failed (pass-through)";
- goto out_unlock;
- }
-
driver = to_pci_driver(pdev->dev.driver);
if (!is_driver_supported(driver)) {
if (!driver) {
@@ -216,12 +221,25 @@ static pci_ers_result_t zpci_event_attempt_error_recovery(struct pci_dev *pdev)
goto out_unlock;
}
+ zpci_store_pci_error(pdev, ccdf);
Sashiko notes that zdev->pending_errs.mediated_recovery could become
true between the above zpci_store_pci_error() and the below check for
leaving recovery to user-space. I think we could make a general
improvement that also tackles this concern. The idea is that we could
have zpci_store_pci_error() return true if it did store the error and
we are in mediated recovery mode. Then we use that as the signal to
skip host recovery below. That way we also don't need to retake the
pending_errs_lock, which makes the below much simpler, and it would be a
win independent of the race. As for the race, this would make sure that
we either do the host recovery or store the error and let user-space
recover.
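For illustration only, the suggestion above could be modelled in user space roughly as follows. This is a minimal sketch, not the patch: struct zdev_model, store_pci_error_model() and attempt_recovery_model() are hypothetical stand-ins, a pthread mutex stands in for pending_errs_lock, and it assumes the error is only stored while mediated recovery is enabled. The point it shows is that the mode is sampled exactly once, under the same lock hold that stores the error, and the caller branches on the returned value instead of re-reading the flag later.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical user-space stand-ins for the kernel structures. */
struct pending_errs_model {
	bool mediated_recovery;	/* user-space has taken over recovery */
	int count;		/* number of stored error events */
};

struct zdev_model {
	pthread_mutex_t pending_errs_lock;
	struct pending_errs_model pending_errs;
};

enum recovery_path { RECOVERY_HOST, RECOVERY_USER_SPACE };

/*
 * Models the suggested return value: true only if the error was
 * stored and mediated recovery is active. Both facts are decided
 * under a single hold of the lock. Assumption: nothing is stored
 * when mediated recovery is off.
 */
static bool store_pci_error_model(struct zdev_model *zdev)
{
	bool stored = false;

	pthread_mutex_lock(&zdev->pending_errs_lock);
	if (zdev->pending_errs.mediated_recovery) {
		zdev->pending_errs.count++;
		stored = true;
	}
	pthread_mutex_unlock(&zdev->pending_errs_lock);

	return stored;
}

/*
 * The caller keeps the result in a local variable and never looks
 * at the flag again, so a later change of the flag cannot make the
 * two steps disagree.
 */
static enum recovery_path attempt_recovery_model(struct zdev_model *zdev)
{
	bool mediated = store_pci_error_model(zdev);

	/* The error_detected() notification would be issued here. */

	if (mediated)
		return RECOVERY_USER_SPACE;	/* skip host recovery */

	return RECOVERY_HOST;
}
```

In this model, a device with mediated_recovery unset always takes the host path and stores nothing, and a device with it set stores one event and leaves recovery to user-space, without a second lock/unlock pair in the caller.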
I did consider the concern about mediated_recovery becoming true after zpci_store_pci_error(), but IIUC in that case we wouldn't even be able to deliver the error signal to userspace (via error_detected()). I also don't think the mediated_recovery flag can be set to true at that point: since we are holding the PCI device lock, vfio_pci_core_enable() will fail when it tries to reset the device.
Thanks
Farhan
ers_res = zpci_event_notify_error_detected(pdev, driver);
if (ers_result_indicates_abort(ers_res)) {
status_str = "failed (abort on detection)";
goto out_unlock;
}
+ mutex_lock(&zdev->pending_errs_lock);
+ if (zdev->pending_errs.mediated_recovery) {
+ pr_info("%s: Leaving recovery of pass-through device to user-space\n",
+ pci_name(pdev));
+ ers_res = PCI_ERS_RESULT_RECOVERED;
+ status_str = "in progress";
+ mutex_unlock(&zdev->pending_errs_lock);
+ goto out_unlock;
+ }
+ mutex_unlock(&zdev->pending_errs_lock);