RE: [PATCH RFC v2 00/18] Add VFIO mediated device support and DEV-MSI support for the idxd driver

From: Tian, Kevin
Date: Thu Aug 13 2020 - 01:26:25 EST

> From: Jason Wang <jasowang@xxxxxxxxxx>
> Sent: Thursday, August 13, 2020 12:34 PM
> On 2020/8/12 12:05 PM, Tian, Kevin wrote:
> >> The problem is that if we tie all controls via VFIO uAPI, the other
> >> subsystem like vDPA is likely to duplicate them. I wonder if there is a
> >> way to decouple the vSVA out of VFIO uAPI?
> > vSVA is a per-device (either pdev or mdev) feature thus naturally should
> > be managed by its device driver (VFIO or vDPA). From this angle some
> > duplication is inevitable given VFIO and vDPA are orthogonal passthrough
> > frameworks. Within the kernel the majority of vSVA handling is done by
> > the IOMMU and IOASID modules, thus most of the logic is shared.
> So why not introduce vSVA uAPI at IOMMU or IOASID layer?

One may ask a similar question: why doesn't IOMMU expose map/unmap
as uAPI...

> >
> >>> If a userspace DMA interface can be easily
> >>> adapted to be a passthrough one, it might be the choice.
> >> It's not that easy even for VFIO, which requires a lot of new uAPIs and
> >> infrastructure (e.g. mdev) to be invented.
> >>
> >>
> >>> But for idxd,
> >>> we see mdev a much better fit here, given the big difference between
> >>> what userspace DMA requires and what guest driver requires in this hw.
> >> A weak point for mdev is that it can't serve kernel subsystem other than
> >> VFIO. In this case, you need some other infrastructures (like [1]) to do
> >> this.
> > mdev is not exclusive from kernel usages. It's perfectly fine for a driver
> > to reserve some work queues for host usages, while wrapping others
> > into mdevs.
> I meant you may want slices to be an independent device from the kernel
> point of view:
> E.g. for ethernet devices, you may want 10K mdevs to be passed to guests.
> Similarly, you may want 10K net devices which are connected to the kernel
> networking subsystem.
> In this case it's not simply reserving queues but you need some other
> type of device abstraction. There could be some kind of duplication
> between this and mdev.

Yes, some abstraction is required, but isn't that what the driver
should care about, rather than the mdev framework itself? If the driver
reports the same set of resources to both mdev and networking, it needs
to make sure that a resource claimed through one interface is marked
in-use in the other. E.g. each mdev type includes an
available_instances attribute. The driver could report 10K available
instances initially and then update it to 5K when the other 5K are
used for net devices later.
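
As a rough illustration of that accounting (a minimal sketch only, not
idxd or mdev code; SharedPool, claim() and release() are hypothetical
names), a single driver-owned counter can back both interfaces, so a
claim made through one is immediately visible through the other:

```python
class SharedPool:
    """Hypothetical driver-side pool backing both mdevs and net devices."""

    def __init__(self, total):
        self.total = total
        # Per-interface usage; both draw from the same underlying resources.
        self.claimed = {"mdev": 0, "netdev": 0}

    @property
    def available_instances(self):
        # The value the driver would report, whichever interface asks.
        return self.total - sum(self.claimed.values())

    def claim(self, interface, count):
        if interface not in self.claimed:
            raise ValueError(f"unknown interface: {interface}")
        if count < 0 or count > self.available_instances:
            raise ValueError("not enough free instances")
        self.claimed[interface] += count

    def release(self, interface, count):
        if interface not in self.claimed:
            raise ValueError(f"unknown interface: {interface}")
        if count < 0 or count > self.claimed[interface]:
            raise ValueError("releasing more than was claimed")
        self.claimed[interface] -= count


pool = SharedPool(10_000)          # 10K instances reported initially
pool.claim("netdev", 5_000)        # 5K later used for net devices
print(pool.available_instances)    # what the mdev side would now report
```

In a real driver the counter would need locking, and the value would be
surfaced through the mdev type's available_instances attribute rather
than a Python property.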

Mdev definitely has its usage limitations. Some may be improved
in the future, some may not. But those are distracting from the
original purpose of this thread (mdev vs. userspace DMA) and are
better discussed in other places, e.g. LPC...