RE: [PATCH v2 1/3] docs: IOMMU user API

From: Liu, Yi L
Date: Thu Jun 18 2020 - 22:15:53 EST


Hi Alex,

> From: Alex Williamson <alex.williamson@xxxxxxxxxx>
> Sent: Friday, June 19, 2020 5:48 AM
>
> On Wed, 17 Jun 2020 08:28:24 +0000
> "Tian, Kevin" <kevin.tian@xxxxxxxxx> wrote:
>
> > > From: Liu, Yi L <yi.l.liu@xxxxxxxxx>
> > > Sent: Wednesday, June 17, 2020 2:20 PM
> > >
> > > > From: Jacob Pan <jacob.jun.pan@xxxxxxxxxxxxxxx>
> > > > Sent: Tuesday, June 16, 2020 11:22 PM
> > > >
> > > > On Thu, 11 Jun 2020 17:27:27 -0700
> > > > Jacob Pan <jacob.jun.pan@xxxxxxxxxxxxxxx> wrote:
> > > >
> > > > > >
> > > > > > But then I thought it even better if VFIO leaves the entire
> > > > > > copy_from_user() to the layer consuming it.
> > > > > >
> > > > > OK. Sounds good, that was what Kevin suggested also. I just wasn't
> > > > > sure how much VFIO wants to inspect, I thought VFIO layer wanted to do
> > > > > a sanity check.
> > > > >
> > > > > Anyway, I will move copy_from_user to iommu uapi layer.
> > > >
> > > > Just one more point brought up by Yi when we discuss this offline.
> > > >
> > > > If we move copy_from_user to iommu uapi layer, then there will be
> > > > multiple copy_from_user calls for the same data when a VFIO container
> > > > has multiple domains, devices. For bind, it might be OK. But might be
> > > > additional overhead for TLB flush request from the guest.
> > >
> > > I think it is the same with bind and TLB flush path. will be multiple
> > > copy_from_user.
> >
> > multiple copies is possibly fine. In reality we allow only one group per
> > nesting container (as described in patch [03/15]), and usually there
> > is just one SVA-capable device per group.
> >
> > >
> > > BTW. for moving data copy to iommu layer, there is another point which
> > > need to consider. VFIO needs to do unbind in bind path if bind failed,
> > > so it will assemble unbind_data and pass to iommu layer. If iommu layer
> > > do the copy_from_user, I think it will be failed. any idea?
>
> If a call into a UAPI fails, there should be nothing to undo. Creating
> a partial setup for a failed call that needs to be undone by the caller
> is not good practice.

Is it still a problem if VFIO itself undoes the partial setup before
returning to user space?

> > This might be mitigated if we go back to use the same bind_data for both
> > bind/unbind. Then you can reuse the user object for unwinding.
> >
> > However there is another case where VFIO may need to assemble the
> > bind_data itself. When a VM is killed, VFIO needs to walk allocated PASIDs
> > and unbind them one-by-one. In such case copy_from_user doesn't work
> > since the data is created by kernel. Alex, do you have a suggestion how this
> > usage can be supported? e.g. asking IOMMU driver to provide two sets of
> > APIs to handle user/kernel generated requests?
>
> Yes, it seems like vfio would need to make use of a driver API to do
> this, we shouldn't be faking a user buffer in the kernel in order to
> call through to a UAPI. Thanks,

OK, so if VFIO wants to issue an unbind by itself, it should use an API that
passes a kernel buffer to the IOMMU layer. If the unbind request comes from
user space, VFIO should use another API that passes the user buffer pointer
to the IOMMU layer. Makes sense. I will align with Jacob.
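
To make the split concrete, here is a rough userspace sketch of the two entry
points. It is not the actual patch: every name below (struct unbind_data,
iommu_unbind_kernel(), iommu_unbind_user(), mock_copy_from_user()) is
hypothetical, and copy_from_user() is mocked with memcpy() so the example can
be compiled outside the kernel.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the real unbind payload. */
struct unbind_data {
	unsigned int argsz;	/* size of the structure as filled in by the caller */
	unsigned int pasid;	/* PASID to unbind */
};

/*
 * Userspace mock of copy_from_user() (assumption: only for illustration).
 * Like the kernel helper, it returns the number of bytes NOT copied.
 */
unsigned long mock_copy_from_user(void *to, const void *from, unsigned long n)
{
	if (!from)
		return n;
	memcpy(to, from, n);
	return 0;
}

/* Records the last PASID handed to the kernel path, so callers can check it. */
int last_unbound_pasid = -1;

/*
 * Kernel-buffer entry point: used when VFIO assembles the request itself,
 * e.g. when walking allocated PASIDs after a VM is killed.
 */
int iommu_unbind_kernel(const struct unbind_data *data)
{
	if (data->argsz < sizeof(*data))
		return -EINVAL;
	last_unbound_pasid = (int)data->pasid;
	return 0;
}

/*
 * User-buffer entry point: the IOMMU layer does the copy, then shares the
 * kernel path. VFIO only forwards the user pointer.
 */
int iommu_unbind_user(const void *udata)
{
	struct unbind_data data;

	if (mock_copy_from_user(&data, udata, sizeof(data)))
		return -EFAULT;
	return iommu_unbind_kernel(&data);
}
```

With this shape, the user path never sees a faked user buffer, and the
VFIO-initiated unbind never goes through the copy at all.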

Regards,
Yi Liu

> Alex