Re: [PATCH v2 1/3] iommu/uapi: Define uapi version and capabilities
From: Jacob Pan
Date: Thu Apr 02 2020 - 14:30:26 EST
On Wed, 1 Apr 2020 05:32:21 +0000
"Tian, Kevin" <kevin.tian@xxxxxxxxx> wrote:
> > From: Jacob Pan <jacob.jun.pan@xxxxxxxxxxxxxxx>
> > Sent: Tuesday, March 31, 2020 11:55 PM
> >
> > On Tue, 31 Mar 2020 06:06:38 +0000
> > "Tian, Kevin" <kevin.tian@xxxxxxxxx> wrote:
> >
> > > > From: Jacob Pan <jacob.jun.pan@xxxxxxxxxxxxxxx>
> > > > Sent: Tuesday, March 31, 2020 12:08 AM
> > > >
> > > > On Mon, 30 Mar 2020 05:40:40 +0000
> > > > "Tian, Kevin" <kevin.tian@xxxxxxxxx> wrote:
> > > >
> > > > > > From: Jacob Pan <jacob.jun.pan@xxxxxxxxxxxxxxx>
> > > > > > Sent: Saturday, March 28, 2020 7:54 AM
> > > > > >
> > > > > > On Fri, 27 Mar 2020 00:47:02 -0700
> > > > > > Christoph Hellwig <hch@xxxxxxxxxxxxx> wrote:
> > > > > >
> > > > > > > On Fri, Mar 27, 2020 at 02:49:55AM +0000, Tian, Kevin
> > > > > > > wrote:
> > > > > > > > If those API calls are inter-dependent for composing a
> > > > > > > > feature (e.g. SVA), shouldn't we need a way to check
> > > > > > > > them together before exposing the feature to the guest,
> > > > > > > > e.g. through a iommu_get_uapi_capabilities interface?
> > > > > > >
> > > > > > > Yes, that makes sense. The important bit is to have a
> > > > > > > capability flags and not version numbers.
> > > > > >
> > > > > > The challenge is that there are two consumers in the kernel
> > > > > > for this. 1. VFIO only look for compatibility, and size of
> > > > > > each data struct such that it can copy_from_user.
> > > > > >
> > > > > > 2. IOMMU driver, the "real consumer" of the content.
> > > > > >
> > > > > > For 2, I agree and we do plan to use the capability flags to
> > > > > > check content and maintain backward compatibility etc.
> > > > > >
> > > > > > For VFIO, it is difficult to do size look up based on
> > > > > > capability flags.
> > > > >
> > > > > Can you elaborate the difficulty in VFIO? if, as Christoph
> > > > > Hellwig pointed out, version number is already avoided
> > > > > everywhere, it is interesting to know whether this work
> > > > > becomes a real exception or just requires a different mindset.
> > > > >
> > > > From VFIO p.o.v. the IOMMU UAPI data is opaque, it only needs
> > > > to do two things:
> > > > 1. is the UAPI compatible?
> > > > 2. what is the size to copy?
> > > >
> > > > If you look at the version number, this is really a "version as
> > > > size" lookup, as provided by the helper function in this patch.
> > > > An example can be the newly introduced clone3 syscall.
> > > > https://lwn.net/Articles/792628/
> > > > In clone3, new version must have new size. The slight difference
> > > > here is that, unlike clone3, we have multiple data structures
> > > > instead of a single struct clone_args {}. And each struct has
> > > > flags to enumerate its contents besides size.
> > >
> > > Thanks for providing that link. However clone3 doesn't include a
> > > version field to do "version as size" lookup. Instead, as you
> > > said, it includes a size parameter which sounds like the option 3
> > > (argsz) listed below.
> > >
> > Right, there is no version in clone3. size = version. I view this as
> > a 1:1 lookup.
> >
> > > >
> > > > Besides breaching data abstraction, if VFIO has to check IOMMU
> > > > flags to determine the sizes, it has many combinations.
> > > >
> > > > We also separate the responsibilities into two parts
> > > > 1. compatibility - version, size by VFIO
> > > > 2. sanity check - capability flags - by IOMMU
> > >
> > > I feel argsz+flags approach can perfectly meet above requirement.
> > > The userspace set the size and flags for whatever capabilities it
> > > uses, and VFIO simply copies the parameters by size and pass to
> > > IOMMU for further sanity check. Of course the assumption is that
> > > we do provide an interface for userspace to enumerate all
> > > supported capabilities.
> > You cannot trust user for argsz. the size to be copied from user
> > must be based on knowledge in kernel. That is why we have this
> > version to size lookup.
> >
> > In VFIO, the size to copy is based on knowledge of each VFIO UAPI
> > structures and VFIO flags. But here the flags are IOMMU UAPI flags.
> > As you pointed out in another thread, VFIO is one user.
>
> If that is the case, can we let VFIO only copy its own UAPI fields
> while simply passing the user pointer of IOMMU UAPI structure to IOMMU
> driver for further size check and copy? Otherwise we are entering a
> dead end that VFIO doesn't want to parse a structure which is not
> defined by him while using version to represent the black box size
> is considered as a discarded scheme and doesn't scale well...
>
I think this could be an other viable option. Let me try to summarize
since this has been a long discussion since the original version.
Problem statements:
1. When launching vIOMMU in the guest, how can we ensure the host has
compatible support upfront? as compared to fail later.
2. As UAPI data gets extended (both in size and flags), how can we know
the size to copy
3. Maintain backward compatibility while allowing extensions?
I think we all agreed that using flags (capability or types) is the way
to address #3. As Christoph pointed out, version number should not be
used for this purpose.
So for problem 1 & 2, we have the following options:
1. Have a version-size mapping as proposed in this set. VFIO copies from
user the correct size based on version-type lookup. Processing of the
data is based on flags in IOMMU driver.
2. VFIO copy its own minsz then pass the user pointer to IOMMU driver
for further copy_from_user based on flags. (by Kevin)
3. Adopt VFIO argsz scheme, caller fills in argsz for the offset the
variable size union. VFIO do not check argsz in that it requires IOMMU
specific knowledge. IOMMU driver Use flags to handle the variable
size.(by Alex). I think this what we have in Yi's VFIO & QEMU patch.
argsz filled by QEMU includes bind_data.
4. Do not use a unified version, have a fixed size of all UAPI
structures, padding in struct and union. (Wasteful, not preferred per
V1 discussion)
For both 2 & 3, a unified version is not used, each API
treated separately. vIOMMU will be launched w/o assurance of
compatibility of all APIs. Fault handling may be more complex in normal
operations.
Appreciate everyone's input. Joerg and Alex, could you help to make a
decision here?
Thanks,
Jacob
> >
> [...]
> [...]
> [...]
> > > >
> > > > [Jacob Pan]
> >
> > [Jacob Pan]
>
> Thanks
> Kevin
[Jacob Pan]