Re: [RFC v2 12/20] dma-iommu: Implement NESTED_MSI cookie

From: Auger Eric
Date: Sat Oct 27 2018 - 05:25:06 EST


Hi Robin,

On 10/25/18 12:05 AM, Robin Murphy wrote:
> On 2018-10-24 7:44 pm, Auger Eric wrote:
>> Hi Robin,
>>
>> On 10/24/18 8:02 PM, Robin Murphy wrote:
>>> Hi Eric,
>>>
>>> On 2018-09-18 3:24 pm, Eric Auger wrote:
>>>> Up to now, when the type was UNMANAGED, we used to
>>>> allocate IOVA pages within a range provided by the user.
>>>> This does not work in nested mode.
>>>>
>>>> If both the host and the guest are exposed with SMMUs, each
>>>> would allocate an IOVA. The guest allocates an IOVA (gIOVA)
>>>> to map onto the guest MSI doorbell (gDB). The Host allocates
>>>> another IOVA (hIOVA) to map onto the physical doorbell (hDB).
>>>>
>>>> So we end up with 2 unrelated mappings, at S1 and S2:
>>>> ÂÂÂÂÂÂÂÂÂÂ S1ÂÂÂÂÂÂÂÂÂÂÂÂ S2
>>>> gIOVAÂÂÂ ->ÂÂÂÂ gDB
>>>> ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ hIOVAÂÂÂ ->ÂÂÂ hDB
>>>>
>>>> The PCI device would be programmed with hIOVA.
>>>>
>>>> iommu_dma_bind_doorbell allows to pass gIOVA/gDB to the host
>>>> so that gIOVA can be used by the host instead of re-allocating
>>>> a new IOVA. That way the host can create the following nested
>>>> mapping:
>>>>
>>>> ÂÂÂÂÂÂÂÂÂÂ S1ÂÂÂÂÂÂÂÂÂÂ S2
>>>> gIOVAÂÂÂ ->ÂÂÂ gDBÂÂÂ ->ÂÂÂ hDB
>>>>
>>>> this time, the PCI device will be programmed with the gIOVA MSI
>>>> doorbell which is correctly map through the 2 stages.
>>>
>>> If I'm understanding things correctly, this plus a couple of the
>>> preceding patches all add up to a rather involved way of coercing an
>>> automatic allocator to only "allocate" predetermined addresses in an
>>> entirely known-ahead-of-time manner.
>> agreed
>> Â Given that the guy calling
>>> iommu_dma_bind_doorbell() could seemingly just as easily call
>>> iommu_map() at that point and not bother with an allocator cookie and
>>> all this machinery at all, what am I missing?
>> Well iommu_dma_map_msi_msg() gets called and is part of this existing
>> MSI mapping machinery. If we do not do anything this function allocates
>> an hIOVA that is not involved in any nested setup. So either we coerce
>> the allocator in place (which is what this series does) or we unplug the
>> allocator to replace this latter with a simple S2 mapping, as you
>> suggest, ie. iommu_map(gDB, hDB). Assuming we unplug the allocator, the
>> guy who actually calls iommu_dma_bind_doorbell() knows gDB but does not
>> know hDB. So I don't really get how we can simplify things.
>
> OK, there's what I was missing :D
>
> But that then seems to reveal a somewhat bigger problem - if the callers
> are simply registering IPAs, and relying on the ITS driver to grab an
> entry and fill in a PA later, then how does either one know *which* PA
> is supposed to belong to a given IPA in the case where you have multiple
> devices with different ITS targets assigned to the same guest?

You're definitively right here. I think this can be resolved by passing
the struct device handle along with the stage1 mapping and storing the
info together. Then when the host MSI controller looks for a free
unmapped iova, it must also check whether the device belongs to its MSI
domain.

(and if
> it's possible to assume a guest will use per-device stage 1 mappings and
> present it with a single vITS backed by multiple pITSes, I think things
> start breaking even harder.)

I don't really get your point here. Assigned devices on guest side
should be in separate iommu domain because we want them to get isolated
from each other. There is a single vITS as of now and I don't think we
will change that anytime soon. The vITS driver is allocating a gIOVA for
each separate domain and I currently "trap" the gIOVA/gPA mapping on
irqfd routing setup. This mapping gets associated to a VFIO IOMMU, one
per assigned device, so we have different vfio containers for each of
them. If I then enumerate all the devices attached to the containers and
pass this stage1 binding along with the device struct, I think we should
be OK?

Thanks

Eric

>
> Other than allowing arbitrary disjoint IOVA pages, I'm not sure this
> really works any differently from the existing MSI cookie now that I
> look more closely :/
>
> Robin.