Re: [PATCH v12 4/7] s390/pci: Store PCI error information for passthrough devices

From: Farhan Ali

Date: Tue Mar 31 2026 - 15:25:44 EST


Hey Matt,

On 3/31/2026 10:41 AM, Matthew Rosato wrote:
On 3/30/26 1:40 PM, Farhan Ali wrote:
For a passthrough device we need co-operation from user space to recover
the device. This would require to bubble up any error information to user
space. Let's store this error information for passthrough devices, so it
can be retrieved later.

We can now have userspace drivers (vfio-pci based) on s390x. The userspace
drivers will not have any KVM fd and so no kzdev associated with them. So
we need to update the logic for detecting passthrough devices to not depend
on struct kvm_zdev.

Signed-off-by: Farhan Ali <alifm@xxxxxxxxxxxxx>
Reviewed-by: Matthew Rosato <mjrosato@xxxxxxxxxxxxx>

Thanks for reviewing!


But a suggestion below:

---
arch/s390/include/asm/pci.h | 30 ++++++++
arch/s390/pci/pci.c | 1 +
arch/s390/pci/pci_event.c | 113 ++++++++++++++++++-------------
drivers/vfio/pci/vfio_pci_zdev.c | 9 ++-
4 files changed, 105 insertions(+), 48 deletions(-)

diff --git a/arch/s390/include/asm/pci.h b/arch/s390/include/asm/pci.h
index c0ff19dab580..7fb9a80b0175 100644
--- a/arch/s390/include/asm/pci.h
+++ b/arch/s390/include/asm/pci.h
@@ -118,6 +118,31 @@ struct zpci_bus {
enum pci_bus_speed max_bus_speed;
};
+/* Content Code Description for PCI Function Error */
+struct zpci_ccdf_err {
+ u32 reserved1;
+ u32 fh; /* function handle */
+ u32 fid; /* function id */
+ u32 ett : 4; /* expected table type */
+ u32 mvn : 12; /* MSI vector number */
+ u32 dmaas : 8; /* DMA address space */
+ u32 reserved2 : 6;
+ u32 q : 1; /* event qualifier */
+ u32 rw : 1; /* read/write */
+ u64 faddr; /* failing address */
+ u32 reserved3;
+ u16 reserved4;
+ u16 pec; /* PCI event code */
+} __packed;
+
+#define ZPCI_ERR_PENDING_MAX 4
+struct zpci_ccdf_pending {
+ u8 count;
+ u8 head;
+ u8 tail;
+ struct zpci_ccdf_err err[ZPCI_ERR_PENDING_MAX];
+};
+
/* Private data per function */
struct zpci_dev {
struct zpci_bus *zbus;
@@ -171,6 +196,7 @@ struct zpci_dev {
char res_name[16];
bool mio_capable;
+ bool mediated_recovery;
struct zpci_bar_struct bars[PCI_STD_NUM_BARS];
u64 start_dma; /* Start of available DMA addresses */
@@ -192,6 +218,8 @@ struct zpci_dev {
struct iommu_domain *s390_domain; /* attached IOMMU domain */
struct kvm_zdev *kzdev;
struct mutex kzdev_lock;
+ struct zpci_ccdf_pending pending_errs;
+ struct mutex pending_errs_lock;
This mutex now protects the pending_errs struct (obvious) as well as the
mediated_recovery bool (not obvious). Add a comment here and/or as a
block before the needs/start/stop functions?

I can add a comment here. I was in two minds about renaming the lock, but at the end decided against to keep the diff minimum for the patches.



Actually.. Could mediated_recovery just be part of the
zpci_ccdf_pending stucture? AFAICT the bit basically controls whether
or not the zpci_ccdf_pending structure is used / has meaning. You even
turn the bit off at the same time you memset(&zdev->pending_errs, 0) in
zpci_stop_mediated_recovery() -- though an explicit setting of
mediated_devices = false would still be nice for code clarity.

Then the spinlock continues to protect only this structure.

You are right, the mediated_recovery flag drives the use of the zpci_ccdf_pending struct. IMHO keeping the flag helped me with code clarity and keeping the struct simple and not be tied with the flag. But if you (or anyone) feels strongly about it then I can add the flag to the zpci_ccdf_pending struct.

Thanks

Farhan