RE: [PATCH v2 1/3] iommu/uapi: Define uapi version and capabilities

From: Tian, Kevin
Date: Tue Mar 31 2020 - 02:06:47 EST


> From: Jacob Pan <jacob.jun.pan@xxxxxxxxxxxxxxx>
> Sent: Tuesday, March 31, 2020 12:08 AM
>
> On Mon, 30 Mar 2020 05:40:40 +0000
> "Tian, Kevin" <kevin.tian@xxxxxxxxx> wrote:
>
> > > From: Jacob Pan <jacob.jun.pan@xxxxxxxxxxxxxxx>
> > > Sent: Saturday, March 28, 2020 7:54 AM
> > >
> > > On Fri, 27 Mar 2020 00:47:02 -0700
> > > Christoph Hellwig <hch@xxxxxxxxxxxxx> wrote:
> > >
> > > > On Fri, Mar 27, 2020 at 02:49:55AM +0000, Tian, Kevin wrote:
> > > > > If those API calls are inter-dependent for composing a feature
> > > > > (e.g. SVA), shouldn't we need a way to check them together
> > > > > before exposing the feature to the guest, e.g. through a
> > > > > iommu_get_uapi_capabilities interface?
> > > >
> > > > Yes, that makes sense. The important bit is to have a capability
> > > > flags and not version numbers.
> > >
> > > The challenge is that there are two consumers in the kernel for
> > > this. 1. VFIO only look for compatibility, and size of each data
> > > struct such that it can copy_from_user.
> > >
> > > 2. IOMMU driver, the "real consumer" of the content.
> > >
> > > For 2, I agree and we do plan to use the capability flags to check
> > > content and maintain backward compatibility etc.
> > >
> > > For VFIO, it is difficult to do size look up based on capability
> > > flags.
> >
> > Can you elaborate the difficulty in VFIO? if, as Christoph Hellwig
> > pointed out, version number is already avoided everywhere, it is
> > interesting to know whether this work becomes a real exception
> > or just requires a different mindset.
> >
> From VFIO p.o.v. the IOMMU UAPI data is opaque, it only needs to do two
> things:
> 1. is the UAPI compatible?
> 2. what is the size to copy?
>
> If you look at the version number, this is really a "version as size"
> lookup, as provided by the helper function in this patch. An example
> can be the newly introduced clone3 syscall.
> https://lwn.net/Articles/792628/
> In clone3, new version must have new size. The slight difference here
> is that, unlike clone3, we have multiple data structures instead of a
> single struct clone_args {}. And each struct has flags to enumerate its
> contents besides size.

Thanks for providing that link. However, clone3 doesn't include a version
field for a "version as size" lookup. Instead, as you said, it takes a
size parameter, which sounds like option 3 (argsz) listed below.

>
> Besides breaching data abstraction, if VFIO has to check IOMMU flags to
> determine the sizes, it has many combinations.
>
> We also separate the responsibilities into two parts
> 1. compatibility - version, size by VFIO
> 2. sanity check - capability flags - by IOMMU

I feel the argsz+flags approach can meet the above requirements. Userspace
sets the size and flags for whatever capabilities it uses, and VFIO
simply copies the parameters by size and passes them to the IOMMU driver
for a further sanity check. Of course, the assumption is that we provide
an interface for userspace to enumerate all supported capabilities.
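
To make the split concrete, here is a rough userspace sketch of what I
have in mind. Everything named demo_* (the struct layout, the flags, the
helpers) is made up for illustration and is not taken from this patch or
from the existing VFIO/IOMMU code. memcpy() stands in for copy_from_user(),
and ignoring a tail the kernel doesn't know about is just one possible
policy:

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical UAPI struct, for illustration only. */
struct demo_bind_data {
	uint32_t argsz;   /* size of the struct as filled in by userspace */
	uint32_t flags;   /* capability bits that userspace actually uses */
	uint64_t gpasid;  /* valid if DEMO_FLAG_GPASID_VAL is set */
	uint64_t ext;     /* added by a later extension, needs DEMO_FLAG_EXT */
};

#define DEMO_FLAG_GPASID_VAL	(1u << 0)
#define DEMO_FLAG_EXT		(1u << 1)
#define DEMO_FLAGS_SUPPORTED	(DEMO_FLAG_GPASID_VAL | DEMO_FLAG_EXT)

/* Size of the original layout, i.e. everything before 'ext'. */
#define DEMO_MINSZ	offsetof(struct demo_bind_data, ext)

/*
 * "VFIO" side: the data is opaque, only argsz is looked at.
 * Copy what userspace provided, zero-fill what it did not.
 */
static int demo_vfio_copy(struct demo_bind_data *dst, const void *src)
{
	uint32_t argsz;
	size_t n;

	memcpy(&argsz, src, sizeof(argsz));
	if (argsz < DEMO_MINSZ)
		return -EINVAL;

	/* A newer userspace may pass more than we know; ignore the tail. */
	n = argsz < sizeof(*dst) ? argsz : sizeof(*dst);
	memset(dst, 0, sizeof(*dst));
	memcpy(dst, src, n);
	return 0;
}

/* "IOMMU" side: sanity check the content based on flags. */
static int demo_iommu_check(const struct demo_bind_data *d)
{
	if (d->flags & ~DEMO_FLAGS_SUPPORTED)
		return -EINVAL;

	/* A flag must not claim a field that argsz does not cover. */
	if ((d->flags & DEMO_FLAG_EXT) &&
	    d->argsz < offsetof(struct demo_bind_data, ext) + sizeof(d->ext))
		return -EINVAL;

	return 0;
}
```

With this, an old userspace passing argsz == DEMO_MINSZ keeps working
against a newer kernel, and VFIO never has to interpret the flags.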

Is there anything that I overlooked here? I suppose there might be
some difficulties that block you from going the argsz way...

Thanks
Kevin

>
> I think the latter matches Christoph's comments. So we are in
> agreement at the IOMMU level :)
>
> For example:
> During guest PASID bind, IOMMU driver operates on the data passed from
> VFIO and check format & flags to take different code path.
>
> #define IOMMU_PASID_FORMAT_INTEL_VTD 1
> __u32 format;
> #define IOMMU_SVA_GPASID_VAL (1 << 0) /* guest PASID valid */
> __u64 flags;
>
> Jacob
>
> > btw the most relevant discussion which I can find out now is here:
> > https://lkml.org/lkml/2020/2/3/1126
> >
> > It mentioned 3 options for handling extension:
> > --
> > 1. Disallow adding new members to each structure other than reuse
> > padding bits or adding union members at the end.
> > 2. Allow extension of the structures beyond union, but union size has
> > to be fixed with reserved spaces
> > 3. Adopt VFIO argsz scheme, I don't think we need version for each
> > struct anymore. argsz implies the version that user is using assuming
> > UAPI data is extension only.
> > --
> >
> > the first two are both version-based. Looks most guys agreed with
> > option-1 (in this v2), but Alex didn't give his opinion at the
> > moment. The last response from him was the raise of option-3 using
> > argsz to avoid version. So, we also need hear from him. Alex?
> >
> > Thanks
> > Kevin
>
> [Jacob Pan]