Re: [PATCH v12 4/7] s390/pci: Store PCI error information for passthrough devices

From: Matthew Rosato

Date: Tue Mar 31 2026 - 14:01:05 EST


On 3/30/26 1:40 PM, Farhan Ali wrote:
> For a passthrough device we need co-operation from user space to recover
> the device. This would require to bubble up any error information to user
> space. Let's store this error information for passthrough devices, so it
> can be retrieved later.
>
> We can now have userspace drivers (vfio-pci based) on s390x. The userspace
> drivers will not have any KVM fd and so no kzdev associated with them. So
> we need to update the logic for detecting passthrough devices to not depend
> on struct kvm_zdev.
>
> Signed-off-by: Farhan Ali <alifm@xxxxxxxxxxxxx>

Reviewed-by: Matthew Rosato <mjrosato@xxxxxxxxxxxxx>

But a suggestion below:

> ---
> arch/s390/include/asm/pci.h | 30 ++++++++
> arch/s390/pci/pci.c | 1 +
> arch/s390/pci/pci_event.c | 113 ++++++++++++++++++-------------
> drivers/vfio/pci/vfio_pci_zdev.c | 9 ++-
> 4 files changed, 105 insertions(+), 48 deletions(-)
>
> diff --git a/arch/s390/include/asm/pci.h b/arch/s390/include/asm/pci.h
> index c0ff19dab580..7fb9a80b0175 100644
> --- a/arch/s390/include/asm/pci.h
> +++ b/arch/s390/include/asm/pci.h
> @@ -118,6 +118,31 @@ struct zpci_bus {
> enum pci_bus_speed max_bus_speed;
> };
>
> +/* Content Code Description for PCI Function Error */
> +struct zpci_ccdf_err {
> + u32 reserved1;
> + u32 fh; /* function handle */
> + u32 fid; /* function id */
> + u32 ett : 4; /* expected table type */
> + u32 mvn : 12; /* MSI vector number */
> + u32 dmaas : 8; /* DMA address space */
> + u32 reserved2 : 6;
> + u32 q : 1; /* event qualifier */
> + u32 rw : 1; /* read/write */
> + u64 faddr; /* failing address */
> + u32 reserved3;
> + u16 reserved4;
> + u16 pec; /* PCI event code */
> +} __packed;
> +
> +#define ZPCI_ERR_PENDING_MAX 4
> +struct zpci_ccdf_pending {
> + u8 count;
> + u8 head;
> + u8 tail;
> + struct zpci_ccdf_err err[ZPCI_ERR_PENDING_MAX];
> +};
> +
> /* Private data per function */
> struct zpci_dev {
> struct zpci_bus *zbus;
> @@ -171,6 +196,7 @@ struct zpci_dev {
>
> char res_name[16];
> bool mio_capable;
> + bool mediated_recovery;
> struct zpci_bar_struct bars[PCI_STD_NUM_BARS];
>
> u64 start_dma; /* Start of available DMA addresses */
> @@ -192,6 +218,8 @@ struct zpci_dev {
> struct iommu_domain *s390_domain; /* attached IOMMU domain */
> struct kvm_zdev *kzdev;
> struct mutex kzdev_lock;
> + struct zpci_ccdf_pending pending_errs;
> + struct mutex pending_errs_lock;

This mutex now protects the pending_errs struct (obvious) as well as the
mediated_recovery bool (not obvious). Add a comment here and/or as a
block before the needs/start/stop functions?

Actually.. Could mediated_recovery just be part of the
zpci_ccdf_pending stucture? AFAICT the bit basically controls whether
or not the zpci_ccdf_pending structure is used / has meaning. You even
turn the bit off at the same time you memset(&zdev->pending_errs, 0) in
zpci_stop_mediated_recovery() -- though an explicit setting of
mediated_devices = false would still be nice for code clarity.

Then the spinlock continues to protect only this structure.