RE: [RFC v2] /dev/iommu uAPI proposal

From: Tian, Kevin
Date: Thu Jul 15 2021 - 02:50:01 EST


> From: Shenming Lu <lushenming@xxxxxxxxxx>
> Sent: Thursday, July 15, 2021 2:29 PM
>
> On 2021/7/15 11:55, Tian, Kevin wrote:
> >> From: Shenming Lu <lushenming@xxxxxxxxxx>
> >> Sent: Thursday, July 15, 2021 11:21 AM
> >>
> >> On 2021/7/9 15:48, Tian, Kevin wrote:
> >>> 4.6. I/O page fault
> >>> +++++++++++++++++++
> >>>
> >>> uAPI is TBD. Here is just about the high-level flow from host IOMMU
> >>> driver to guest IOMMU driver and backwards. This flow assumes that I/O
> >>> page faults are reported via IOMMU interrupts. Some devices report
> >>> faults via device specific way instead of going through the IOMMU.
> >>> That usage is not covered here:
> >>>
> >>> - Host IOMMU driver receives an I/O page fault with raw fault_data {rid,
> >>> pasid, addr};
> >>>
> >>> - Host IOMMU driver identifies the faulting I/O page table according to
> >>> {rid, pasid} and calls the corresponding fault handler with an opaque
> >>> object (registered by the handler) and raw fault_data (rid, pasid, addr);
> >>>
> >>> - IOASID fault handler identifies the corresponding ioasid and device
> >>> cookie according to the opaque object, generates a user fault_data
> >>> (ioasid, cookie, addr) in the fault region, and triggers eventfd to
> >>> userspace;
> >>>
> >>
> >> Hi, I have some doubts here:
> >>
> >> For mdev, it seems that the rid in the raw fault_data is the parent
> >> device's, then in the vSVA scenario, how can we get to know the
> >> mdev(cookie) from the rid and pasid?
> >>
> >> And from this point of view, would it be better to register the mdev
> >> (iommu_register_device()) with the parent device info?
> >>
> >
> > This is what is proposed in this RFC. A successful binding generates a new
> > iommu_dev object for each vfio device. For mdev this object includes
> > its parent device, the defPASID marking this mdev, and the cookie
> > representing it in userspace. Later this iommu_dev is recorded in
> > the attaching_data when the mdev is attached to an IOASID:
> >
> > struct iommu_attach_data *__iommu_device_attach(
> > struct iommu_dev *dev, u32 ioasid, u32 pasid, int flags);
> >
> > Then when a fault is reported, the fault handler just needs to figure out
> > iommu_dev according to {rid, pasid} in the raw fault data.
> >
>
> Yeah, we have the defPASID that marks the mdev and refers to the default
> I/O address space, but how about the non-default I/O address spaces?
> Is there a case that two different mdevs (on the same parent device)
> are used by the same process in the guest, thus have the same pasid route
> in the physical IOMMU? It seems that we can't figure out the mdev from
> the rid and pasid in this case...
>
> Did I misunderstand something?... :-)
>

No. You are right about this case. I don't think there is a way to
differentiate one mdev from the other if they come from the same
parent and are attached by the same guest process. In this case the
fault could be reported on either mdev (e.g. the first matching one)
to get it fixed in the guest.
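
To make the ambiguity concrete, here is a minimal user-space sketch of
the lookup, not kernel code. struct attach_entry, fault_lookup() and
fault_match_count() are hypothetical names made up for illustration;
the RFC does not define them, and the real attach data would carry
more state. It only shows that {rid, pasid} alone can match more than
one mdev, and what "first matching one" would mean:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical, simplified stand-in for the per-attachment data
 * discussed above. Field names are illustrative assumptions only.
 */
struct attach_entry {
	uint16_t rid;     /* requester ID; for an mdev, the parent's RID */
	uint32_t pasid;   /* PASID routed in the physical IOMMU */
	uint32_t ioasid;  /* IOASID the device is attached to */
	uint64_t cookie;  /* user-provided device cookie */
};

/* Return the first entry matching {rid, pasid}, or NULL if none does. */
const struct attach_entry *fault_lookup(const struct attach_entry *tbl,
					size_t n, uint16_t rid,
					uint32_t pasid)
{
	for (size_t i = 0; i < n; i++) {
		if (tbl[i].rid == rid && tbl[i].pasid == pasid)
			return &tbl[i];
	}
	return NULL;
}

/* Count entries matching {rid, pasid}; more than one means ambiguity. */
size_t fault_match_count(const struct attach_entry *tbl, size_t n,
			 uint16_t rid, uint32_t pasid)
{
	size_t count = 0;

	for (size_t i = 0; i < n; i++) {
		if (tbl[i].rid == rid && tbl[i].pasid == pasid)
			count++;
	}
	return count;
}
```

With two mdevs of one parent attached under the same pasid, the count
is 2 and fault_lookup() simply returns whichever entry was recorded
first, so the fault would be reported with that mdev's cookie.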

Thanks
Kevin