RE: [PATCH v2 1/3] docs: IOMMU user API

From: Liu, Yi L
Date: Thu Jun 18 2020 - 23:30:31 EST


Hi Alex,

> From: Alex Williamson <alex.williamson@xxxxxxxxxx>
> Sent: Friday, June 19, 2020 10:55 AM
>
> On Fri, 19 Jun 2020 02:15:36 +0000
> "Liu, Yi L" <yi.l.liu@xxxxxxxxx> wrote:
>
> > Hi Alex,
> >
> > > From: Alex Williamson <alex.williamson@xxxxxxxxxx>
> > > Sent: Friday, June 19, 2020 5:48 AM
> > >
> > > On Wed, 17 Jun 2020 08:28:24 +0000
> > > "Tian, Kevin" <kevin.tian@xxxxxxxxx> wrote:
> > >
> > > > > From: Liu, Yi L <yi.l.liu@xxxxxxxxx>
> > > > > Sent: Wednesday, June 17, 2020 2:20 PM
> > > > >
> > > > > > From: Jacob Pan <jacob.jun.pan@xxxxxxxxxxxxxxx>
> > > > > > Sent: Tuesday, June 16, 2020 11:22 PM
> > > > > >
> > > > > > On Thu, 11 Jun 2020 17:27:27 -0700 Jacob Pan
> > > > > > <jacob.jun.pan@xxxxxxxxxxxxxxx> wrote:
> > > > > >
> > > > > > > >
> > > > > > > > > But then I thought it even better if VFIO leaves the
> > > > > > > > > entire copy_from_user() to the layer consuming it.
> > > > > > > >
> > > > > > > OK. Sounds good, that was what Kevin suggested also. I just
> > > > > > > wasn't sure how much VFIO wants to inspect, I thought VFIO
> > > > > > > layer wanted to do a sanity check.
> > > > > > >
> > > > > > > Anyway, I will move copy_from_user to iommu uapi layer.
> > > > > >
> > > > > > Just one more point brought up by Yi when we discuss this offline.
> > > > > >
> > > > > > > If we move copy_from_user to iommu uapi layer, then there will
> > > > > > > be multiple copy_from_user calls for the same data when a VFIO
> > > > > > > container has multiple domains, devices. For bind, it might be
> > > > > > > OK. But might be additional overhead for TLB flush request
> > > > > > > from the guest.
> > > > >
> > > > > I think it is the same with bind and TLB flush path. will be
> > > > > multiple copy_from_user.
> > > >
> > > > multiple copies is possibly fine. In reality we allow only one
> > > > group per nesting container (as described in patch [03/15]), and
> > > > usually there is just one SVA-capable device per group.
> > > >
> > > > >
> > > > > > BTW. for moving data copy to iommu layer, there is another point
> > > > > > which needs to be considered. VFIO needs to do unbind in bind path
> > > > > > if bind failed, so it will assemble unbind_data and pass to iommu
> > > > > > layer. If iommu layer does the copy_from_user, I think it will
> > > > > > fail. any idea?
> > >
> > > If a call into a UAPI fails, there should be nothing to undo.
> > > Creating a partial setup for a failed call that needs to be undone
> > > by the caller is not good practice.
> >
> > is it still a problem if it's the VFIO to undo the partial setup
> > before returning to user space?
>
> Yes. If a UAPI function fails there should be no residual effect.

ok. iommu_sva_bind_gpasid() is a per-device call, and it leaves no residual
effect if it fails, so there is no partial setup per device.

But VFIO needs to use iommu_group_for_each_dev() to do the bind, so
if iommu_group_for_each_dev() fails, I guess VFIO needs to undo
the partial setup for the group, right?
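
To make the question concrete, here is a rough userspace sketch of the
group-level unwind I have in mind. It is only an illustration: every mock_*
name is made up, the per-device bind is a stand-in that either fully succeeds
or changes nothing, and none of this is the actual VFIO or IOMMU code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for one device in an IOMMU group. */
struct mock_dev {
	int bound;     /* 1 once the mock bind has succeeded */
	int fail_bind; /* set to 1 to force the mock bind to fail */
};

/* Mock per-device bind: all-or-nothing, no residue on failure. */
static int mock_bind(struct mock_dev *dev)
{
	if (dev->fail_bind)
		return -EINVAL;
	dev->bound = 1;
	return 0;
}

/* Mock per-device unbind. */
static void mock_unbind(struct mock_dev *dev)
{
	dev->bound = 0;
}

/*
 * Bind every device in the group. If one bind fails, unbind the devices
 * that were already bound, so the caller never sees a half-bound group.
 */
static int mock_group_bind(struct mock_dev *devs, size_t n)
{
	size_t i;
	int ret = 0;

	assert(devs != NULL || n == 0);

	for (i = 0; i < n; i++) {
		ret = mock_bind(&devs[i]);
		if (ret)
			break;
	}

	if (ret) {
		/* devs[i] failed and holds no state; unwind devs[0..i-1]. */
		while (i-- > 0)
			mock_unbind(&devs[i]);
	}

	return ret;
}
```

With this shape the failure is contained inside the group-level call, so
nothing is left over by the time the error is returned to user space.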

> > > > This might be mitigated if we go back to use the same bind_data
> > > > for both bind/unbind. Then you can reuse the user object for unwinding.
> > > >
> > > > However there is another case where VFIO may need to assemble the
> > > > bind_data itself. When a VM is killed, VFIO needs to walk
> > > > allocated PASIDs and unbind them one-by-one. In such case
> > > > copy_from_user doesn't work since the data is created by kernel.
> > > > Alex, do you have a suggestion how this usage can be supported?
> > > > > e.g. asking IOMMU driver to provide two sets of APIs to handle
> > > > > user/kernel generated requests?
> > >
> > > Yes, it seems like vfio would need to make use of a driver API to do
> > > this, we shouldn't be faking a user buffer in the kernel in order to
> > > call through to a UAPI. Thanks,
> >
> > ok, so if VFIO wants to issue unbind by itself, it should use an API
> > which passes kernel buffer to IOMMU layer. If the unbind request is
> > from user space, then VFIO should use another API which passes user
> > buffer pointer to IOMMU layer. makes sense. will align with jacob.
>
> Sounds right to me. Different approaches might be used for the driver API versus
> the UAPI, perhaps there is no buffer. Thanks,

Thanks for the guidance. This may require Jacob to add APIs in the iommu
layer for the two purposes.
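
As a rough illustration of the split, here is a userspace sketch. All names
are hypothetical: mock_copy_from_user() only imitates the return convention
of copy_from_user() (number of bytes not copied), struct mock_unbind_data is
not the real uapi layout, and the kernel-side entry point takes a plain PASID
on the assumption that the driver API may not need a buffer at all.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical user-visible structure; not the real uapi layout. */
struct mock_unbind_data {
	unsigned int argsz; /* size of the structure as filled by the user */
	unsigned int pasid;
};

/* Records the last PASID unbound, so the effect can be observed. */
static unsigned int last_unbound_pasid;

/*
 * Userspace stand-in for copy_from_user(): returns the number of bytes
 * that could NOT be copied, so 0 means success.
 */
static unsigned long mock_copy_from_user(void *to, const void *from,
					 unsigned long n)
{
	assert(to != NULL);
	if (!from)
		return n;
	memcpy(to, from, n);
	return 0;
}

/* Kernel-internal path: the caller already holds trusted kernel values. */
static int mock_unbind_kernel(unsigned int pasid)
{
	if (pasid == 0)
		return -EINVAL;
	last_unbound_pasid = pasid;
	return 0;
}

/* User path: copy and sanity-check once, then reuse the kernel path. */
static int mock_unbind_user(const void *uptr)
{
	struct mock_unbind_data data;

	if (mock_copy_from_user(&data, uptr, sizeof(data)))
		return -EFAULT;
	if (data.argsz < sizeof(data))
		return -EINVAL;

	return mock_unbind_kernel(data.pasid);
}
```

In this sketch the VM-teardown case would call mock_unbind_kernel() directly
for each allocated PASID, while a request coming from user space would go
through mock_unbind_user(), so no user buffer is ever faked in the kernel.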

Regards,
Yi Liu

> Alex