Re: [PATCH 1/3] vfio: Introduce vma ops registration and notifier

From: Alex Williamson
Date: Thu Feb 18 2021 - 16:58:12 EST


On Wed, 17 Feb 2021 21:12:09 -0400
Jason Gunthorpe <jgg@xxxxxxxxxx> wrote:

> On Fri, Feb 12, 2021 at 05:20:57PM -0400, Jason Gunthorpe wrote:
> > On Fri, Feb 12, 2021 at 12:27:39PM -0700, Alex Williamson wrote:
> > > Create an interface through vfio-core where a vfio bus driver (ex.
> > > vfio-pci) can register the vm_operations_struct it uses to map device
> > > memory, along with a set of registration callbacks. This allows
> > > vfio-core to expose interfaces for IOMMU backends to match a
> > > vm_area_struct to a bus driver and register a notifier for relevant
> > > changes to the device mapping. For now we define only a notifier
> > > action for closing the device.
> > >
> > > Signed-off-by: Alex Williamson <alex.williamson@xxxxxxxxxx>
> > > ---
> > >  drivers/vfio/vfio.c  | 120 ++++++++++++++++++++++++++++++++++++++++++++++++++
> > >  include/linux/vfio.h |  20 ++++++++
> > >  2 files changed, 140 insertions(+)
> > >
> > > diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
> > > index 38779e6fd80c..568f5e37a95f 100644
> > > --- a/drivers/vfio/vfio.c
> > > +++ b/drivers/vfio/vfio.c
> > > @@ -47,6 +47,8 @@ static struct vfio {
> > >  	struct cdev group_cdev;
> > >  	dev_t group_devt;
> > >  	wait_queue_head_t release_q;
> > > +	struct list_head vm_ops_list;
> > > +	struct mutex vm_ops_lock;
> > >  } vfio;
> > >
> > > struct vfio_iommu_driver {
> > > @@ -2354,6 +2356,121 @@ struct iommu_domain *vfio_group_iommu_domain(struct vfio_group *group)
> > > }
> > > EXPORT_SYMBOL_GPL(vfio_group_iommu_domain);
> > >
> > > +struct vfio_vma_ops {
> > > +	const struct vm_operations_struct *vm_ops;
> > > +	vfio_register_vma_nb_t *reg_fn;
> > > +	vfio_unregister_vma_nb_t *unreg_fn;
> > > +	struct list_head next;
> > > +};
> > > +
> > > +int vfio_register_vma_ops(const struct vm_operations_struct *vm_ops,
> > > +			  vfio_register_vma_nb_t *reg_fn,
> > > +			  vfio_unregister_vma_nb_t *unreg_fn)
> >
> > This just feels a little bit too complicated
> >
> > I've recently learned from Daniel that we can use the address_space
> > machinery to drive the zap_vma_ptes() via unmap_mapping_range(). This
> > technique replaces all the open, close and vma_list logic in vfio_pci.
>
> Here is my effort to make rdma use this, it removes a lot of ugly code:
>
> https://github.com/jgunthorpe/linux/commits/rdma_addr_space
>
> Still needs some more detailed testing.
>
> This gives an option to detect vfio VMAs by checking
>
>   if (vma->vm_file &&
>       file_inode(vma->vm_file) &&
>       file_inode(vma->vm_file)->i_sb->s_type == vfio_fs_type)
>
> And all vfio VMAs can have some consistent vm_private_data, or at
> worst a consistent extended vm operations struct.

Looks pretty slick. I won't claim it's fully gelled in my head yet,
but AIUI you're creating these inodes on your new pseudo fs and
associating them with the actual user fd via the f_mapping pointer,
which allows multiple fds to associate an address space back to this
inode when you want to call unmap_mapping_range(). That clarifies, from
the previous email, how we'd store the inode on the vfio_device without
introducing yet another tracking list for device fds. I'll try to
piece together something similar for vfio, especially if we can avoid
that nasty lock switcheroo we copied from
uverbs_user_mmap_disassociate(). Thanks,

Alex
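
P.S. For anyone following along, here is a rough userspace model of how
I read the f_mapping scheme described above. It is only a sketch under
my own assumptions: every toy_* type and function is a hypothetical
stand-in rather than a kernel definition, and a plain linked list
replaces the i_mmap interval tree that a real address_space uses. The
point it tries to show is that once every device fd points its
f_mapping at one per-device inode, a single walk of that inode's
mapping reaches all the VMAs, with no separate per-fd tracking list.

```c
#include <stddef.h>

struct toy_vma;

/* Stand-in for struct address_space: tracks every VMA mapped through it. */
struct toy_address_space {
	struct toy_vma *vmas;
};

/* Stand-in for the per-device inode on the pseudo fs. */
struct toy_inode {
	struct toy_address_space i_data;
};

/* Stand-in for struct file: one per user fd. */
struct toy_file {
	struct toy_address_space *f_mapping;
};

struct toy_vma {
	struct toy_file *vm_file;
	int ptes_present;		/* 1 while the mapping is populated */
	struct toy_vma *next;
};

/* Models open(): point the fd's f_mapping at the shared device inode. */
static void toy_open(struct toy_file *file, struct toy_inode *inode)
{
	file->f_mapping = &inode->i_data;
}

/* Models mmap(): the VMA is linked into whatever mapping the file uses. */
static void toy_mmap(struct toy_file *file, struct toy_vma *vma)
{
	vma->vm_file = file;
	vma->ptes_present = 1;
	vma->next = file->f_mapping->vmas;
	file->f_mapping->vmas = vma;
}

/* Models unmap_mapping_range(): zap every VMA reachable from the mapping. */
static int toy_unmap_mapping(struct toy_address_space *mapping)
{
	int zapped = 0;

	for (struct toy_vma *vma = mapping->vmas; vma; vma = vma->next) {
		if (vma->ptes_present) {
			vma->ptes_present = 0;
			zapped++;
		}
	}
	return zapped;
}

/* Two fds opened on one device, each with its own mmap. */
static int toy_demo(void)
{
	struct toy_inode dev_inode = { { NULL } };
	struct toy_file fd_a, fd_b;
	struct toy_vma vma_a, vma_b;

	toy_open(&fd_a, &dev_inode);
	toy_open(&fd_b, &dev_inode);
	toy_mmap(&fd_a, &vma_a);
	toy_mmap(&fd_b, &vma_b);

	/* One call through the inode's mapping reaches both VMAs. */
	return toy_unmap_mapping(&dev_inode.i_data);
}
```

Calling toy_demo() zaps both mappings in a single pass even though they
were created through different fds, which is the property that would
let the device hold just the inode. None of the locking or fault
handling is modeled here.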