Re: [RFC v2] /dev/iommu uAPI proposal

From: Shenming Lu
Date: Thu Jul 15 2021 - 04:14:20 EST

On 2021/7/15 14:49, Tian, Kevin wrote:
>> From: Shenming Lu <lushenming@xxxxxxxxxx>
>> Sent: Thursday, July 15, 2021 2:29 PM
>> On 2021/7/15 11:55, Tian, Kevin wrote:
>>>> From: Shenming Lu <lushenming@xxxxxxxxxx>
>>>> Sent: Thursday, July 15, 2021 11:21 AM
>>>> On 2021/7/9 15:48, Tian, Kevin wrote:
>>>>> 4.6. I/O page fault
>>>>> +++++++++++++++++++
>>>>> uAPI is TBD. Here is just about the high-level flow from host IOMMU driver
>>>>> to guest IOMMU driver and backwards. This flow assumes that I/O page faults
>>>>> are reported via IOMMU interrupts. Some devices report faults via device
>>>>> specific way instead of going through the IOMMU. That usage is not covered
>>>>> here:
>>>>> - Host IOMMU driver receives an I/O page fault with raw fault_data {rid,
>>>>> pasid, addr};
>>>>> - Host IOMMU driver identifies the faulting I/O page table according to
>>>>> {rid, pasid} and calls the corresponding fault handler with an opaque
>>>>> object (registered by the handler) and raw fault_data (rid, pasid, addr);
>>>>> - IOASID fault handler identifies the corresponding ioasid and device
>>>>> cookie according to the opaque object, generates a user fault_data
>>>>> (ioasid, cookie, addr) in the fault region, and triggers eventfd to
>>>>> userspace;
>>>> Hi, I have some doubts here:
>>>> For mdev, it seems that the rid in the raw fault_data is the parent device's,
>>>> then in the vSVA scenario, how can we get to know the mdev(cookie) from the
>>>> rid and pasid?
>>>> And from this point of view, would it be better to register the mdev
>>>> (iommu_register_device()) with the parent device info?
>>> This is what is proposed in this RFC. A successful binding generates a new
>>> iommu_dev object for each vfio device. For mdev this object includes
>>> its parent device, the defPASID marking this mdev, and the cookie
>>> representing it in userspace. Later it is iommu_dev being recorded in
>>> the attaching_data when the mdev is attached to an IOASID:
>>> 	struct iommu_attach_data *__iommu_device_attach(
>>> 		struct iommu_dev *dev, u32 ioasid, u32 pasid, int flags);
>>> Then when a fault is reported, the fault handler just needs to figure out
>>> iommu_dev according to {rid, pasid} in the raw fault data.
>> Yeah, we have the defPASID that marks the mdev and refers to the default
>> I/O address space, but how about the non-default I/O address spaces?
>> Is there a case that two different mdevs (on the same parent device)
>> are used by the same process in the guest, and thus have the same pasid route
>> in the physical IOMMU? It seems that we can't figure out the mdev from
>> the rid and pasid in this case...
>> Did I misunderstand something?... :-)
> No. You are right on this case. I don't think there is a way to
> differentiate one mdev from the other if they come from the
> same parent and are attached by the same guest process. In this
> case the fault could be reported on either mdev (e.g. the first
> matching one) to get it fixed in the guest.

OK. Thanks,
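
For anyone following the thread, the case agreed on above can be sketched in
a few lines of C. This is only an illustration under assumptions: the
struct attach_rec layout, the fixed-size table, and the helpers
record_attach(), fault_lookup() and fault_match_count() are all hypothetical
and are not part of the RFC or of any kernel API. It shows a lookup keyed on
{rid, pasid} that returns the first matching record, so two mdevs on the same
parent that share a pasid cannot be told apart.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one attachment; NOT the RFC's actual layout. */
struct attach_rec {
	uint32_t rid;    /* requester ID; for an mdev this is the parent's */
	uint32_t pasid;  /* PASID as routed in the physical IOMMU */
	uint64_t cookie; /* userspace cookie identifying the (m)dev */
};

#define MAX_ATTACH 16

static struct attach_rec attach_table[MAX_ATTACH];
static size_t attach_count;

/* Remember that a device (cookie) was attached under {rid, pasid}. */
static void record_attach(uint32_t rid, uint32_t pasid, uint64_t cookie)
{
	assert(attach_count < MAX_ATTACH);
	attach_table[attach_count++] =
		(struct attach_rec){ .rid = rid, .pasid = pasid, .cookie = cookie };
}

/* Return the first record matching {rid, pasid}, or NULL if none. */
static const struct attach_rec *fault_lookup(uint32_t rid, uint32_t pasid)
{
	for (size_t i = 0; i < attach_count; i++) {
		if (attach_table[i].rid == rid && attach_table[i].pasid == pasid)
			return &attach_table[i];
	}
	return NULL;
}

/* Count how many records match {rid, pasid}; more than one means ambiguity. */
static size_t fault_match_count(uint32_t rid, uint32_t pasid)
{
	size_t n = 0;

	for (size_t i = 0; i < attach_count; i++) {
		if (attach_table[i].rid == rid && attach_table[i].pasid == pasid)
			n++;
	}
	return n;
}
```

If two mdevs with cookies 0xA and 0xB are both recorded under the same
{rid, pasid}, fault_lookup() reports the fault against 0xA only, which
corresponds to the "first matching one" behaviour mentioned in the reply.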