Re: [PATCH v13 5/7] vfio-pci/zdev: Add a device feature for error information

From: Farhan Ali

Date: Tue Apr 14 2026 - 14:46:09 EST



On 4/14/2026 10:41 AM, Alex Williamson wrote:
On Tue, 14 Apr 2026 10:13:22 -0700
Farhan Ali <alifm@xxxxxxxxxxxxx> wrote:
On 4/14/2026 7:12 AM, Alex Williamson wrote:
On Mon, 13 Apr 2026 16:40:49 -0700
Farhan Ali <alifm@xxxxxxxxxxxxx> wrote:
On 4/13/2026 3:57 PM, Alex Williamson wrote:
On Mon, 13 Apr 2026 14:06:06 -0700
Farhan Ali <alifm@xxxxxxxxxxxxx> wrote:
diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index 5de618a3a5ee..2980ca39dd38 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -1534,6 +1534,26 @@ struct vfio_device_feature_dma_buf {
*/
#define VFIO_DEVICE_FEATURE_MIG_PRECOPY_INFOv2 12
+/**
+ * VFIO_DEVICE_FEATURE_ZPCI_ERROR feature provides PCI error information to
+ * userspace for vfio-pci devices on s390x. On s390x, PCI error recovery
+ * involves platform firmware and notification to operating system is done
+ * by architecture specific mechanism. Exposing this information to
+ * userspace allows it to take appropriate actions to handle an
+ * error on the device. The pending_errors provide any additional errors
+ * pending for the device, and userspace should read until zero. A value of
+ * 0 for pending_errors and pec would indicate no pending errors that need
+ * to be handled.
+ */
+
+struct vfio_device_feature_zpci_err {
+ __u8 version;
+ __u8 pending_errors;
+ __u16 pec;
+};
I assume .version is for compatibility, but we don't define a strategy
for using it or specify what the version should be for this table. It
doesn't seem like there's actually an value-add to having it.
Its possible we may need to extend this structure in the future if we
want to report more information to userspace. I at least want the
flexibility to do so. We had some discussion around this [1] in an
earlier version. I was trying to follow similar versioning pattern we
had around vfio-pci/zdev structures.
IMHO, the version field is a dead end towards achieving this,
especially if we don't specify from the onset the expected version
value or the compatibility semantics. All that's going to happen is
that some userspace will hard code that it understands version 1
because that's what's currently reported and matches the struct defined
here, and you can never ever report anything other than version 1
without breaking that user. At that point you need to come up with
some other means for the user to opt-in to a new version, whether it's
triggered by another feature (as we did with the PRECOPY_INFOv2 above
this), or we reimplement the whole v2 feature.
My understanding was based on how we version some of the capability
structures for zdev (in include/uapi/linux/vfio_zdev.h). If we wanted to
provide more information to userspace in the future, what would be
preferred approach? Do we need to explicitly define a v2 feature? I am
open to suggestions on this.

If we need to define v2 explicitly in the future, then yes I agree we
can simplify it to return only the PEC code (or an error code otherwise).
Maybe I'm reading too much into it, but it sounds like you already have
plans to expand this.

The existing zdev structures seem to rely on the version field in the
capability header and maybe you've gotten away with bumping the version
without breaking userspace, but it's fragile. AFAICT, the ioctls do not
define a versioning strategy where vN+1 only adds fields to vN, so all
it takes is one outspoken userspace tool that hard codes its
compatibility to a specific version to become a problem for further
updates.

Likewise this feature doesn't define a versioning policy for userspace
to follow. I'm sure there are other examples within vfio that are
problematic, but let's try not to create more.

If you want to use a version, then also define what the version is and
what the compatibility policy is for future versions. A flags field is
another option that we've used extensively in vfio. The version field
might be better for incremental expansion of the structure, while flags
can address specific fields more directly, ex. a reserved field being
redefined. Thanks,

Alex

Just wanted to understand and clarify if we can associate multiple fields with a feature flag? I think in that case flags would be better here and given its also something that is widely used in vfio. I am just trying to figure out the best way we can extend this without a lot of code churn in the future.

I appreciate the feedback and discussion on this.

Thanks

Farhan