Re: [PATCH v3 02/16] iommu: Introduce cache_invalidate API

From: Jean-Philippe Brucker
Date: Wed May 15 2019 - 11:27:19 EST

On 15/05/2019 15:47, Tian, Kevin wrote:
>> From: Jean-Philippe Brucker
>> Sent: Wednesday, May 15, 2019 7:04 PM
>> On 14/05/2019 18:44, Jacob Pan wrote:
>>> Hi, thank you both for the explanation.
>>> On Tue, 14 May 2019 11:41:24 +0100
>>> Jean-Philippe Brucker <jean-philippe.brucker@xxxxxxx> wrote:
>>>> On 14/05/2019 08:36, Auger Eric wrote:
>>>>> Hi Jacob,
>>>>> On 5/14/19 12:16 AM, Jacob Pan wrote:
>>>>>> On Mon, 13 May 2019 18:09:48 +0100
>>>>>> Jean-Philippe Brucker <jean-philippe.brucker@xxxxxxx> wrote:
>>>>>>> On 13/05/2019 17:50, Auger Eric wrote:
>>>>>>>>> struct iommu_inv_pasid_info {
>>>>>>>>> #define IOMMU_INV_PASID_FLAGS_PASID (1 << 0)
>>>>>>>>> #define IOMMU_INV_PASID_FLAGS_ARCHID (1 << 1)
>>>>>>>>> __u32 flags;
>>>>>>>>> __u32 archid;
>>>>>>>>> __u64 pasid;
>>>>>>>>> };
>>>>>>>> I agree it does the job now. However it looks a bit strange to
>>>>>>>> do a PASID based invalidation in my case - SMMUv3 nested stage -
>>>>>>>> where I don't have any PASID involved.
>>>>>>>> Couldn't we call it context based invalidation then? A context
>>>>>>>> can be tagged by a PASID or/and an ARCHID.
>>>>>>> I think calling it "context" would be confusing as well (I
>>>>>>> shouldn't have used it earlier), since VT-d uses that name for
>>>>>>> device table entries (=STE on Arm SMMU). Maybe "addr_space"?
>>>>>> I am still struggling to understand what ARCHID is after scanning
>>>>>> through SMMUv3.1 spec. It seems to be a constant for a given SMMU.
>>>>>> Why do you need to pass it down every time? Could you point me to
>>>>>> the document or explain a little more on ARCHID use cases?
>>>>>> We have three fields called pasid under this struct
>>>>>> iommu_cache_invalidate_info{}
>>>>>> Gets confusing :)
>>>>> archid is a generic term. That's why you did not find it in the
>>>>> spec ;-)
>>>>> On ARM SMMU the archid is called the ASID (Address Space ID, up to
>>>>> 16 bits). The ASID is stored in the Context Descriptor Entry (your
>>>>> PASID entry) and thus characterizes a given stage 1 translation
>>>>> "context"/"address space".
>>>> Yes, another way to look at it is, for a given address space:
>>>> * PASID tags device-IOTLB (ATC) entries.
>>>> * ASID (here called archid) tags IOTLB entries.
>>>> They could have the same value, but it depends on the guest's
>>>> allocation policy which isn't in our control. With my PASID patches
>>>> for SMMUv3, they have different values. So we need both fields if we
>>>> intend to invalidate both ATC and IOTLB with a single call.
>>> For ASID invalidation, there is also page/address selective within an
>>> ASID, right? I guess it is CMD_TLBI_NH_VA?
>>> So the single call to invalidate both ATC & IOTLB should share the same
>>> address information. i.e.
>>> struct iommu_inv_addr_info {}
>>> Just out of curiosity, what is the advantage of having guest tag its
>>> ATC with its own PASID? I thought you were planning to use custom
>>> ioasid allocator to get PASID from host.
>> Hm, for the moment I mostly considered the custom ioasid allocator for
>> Intel platforms. On Arm platforms the SR-IOV model where each VM has its
>> own PASID space is still very much on the table. This would be the only
>> model supported by a vSMMU emulation for example, since the SMMU
>> doesn't have PASID allocation commands.
> I didn't get how ATS works in such case, if device ATC PASID is different
> from IOTLB ASID. Who will be responsible for translation in-between?

ATS with the SMMU works like this:

* The PCI function sends a Translation Request with PASID.
* The SMMU walks the PASID table (which we call the context descriptor
table) and finds the context descriptor indexed by PASID. This context
descriptor has an ASID field and a page directory pointer.
* After successfully walking the page tables, the SMMU may add an IOTLB
entry tagged by ASID and address, then returns a Translation Completion.
* The PCI function adds an ATC entry tagged by PASID and address.
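
To make the two tags concrete, here is a toy model of the four steps
above. It is only a sketch: the struct and function names are made up
for illustration (they are not from the SMMUv3 driver), each cache
holds a single entry, and the page table walk is assumed to succeed.

```c
/* Toy model only: every name here is hypothetical, not driver code. */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_PASIDS 4

/* One entry of the PASID table (context descriptor table). */
struct ctx_desc {
	bool     valid;
	uint16_t asid;	/* tags IOTLB entries */
	uint64_t pgd;	/* page directory pointer, unused by the toy walk */
};

/* A single cached translation. */
struct tlb_entry {
	bool     valid;
	uint32_t tag;	/* ASID in the IOTLB, PASID in the ATC */
	uint64_t iova;
};

struct smmu {
	struct ctx_desc  cd_table[NUM_PASIDS];
	struct tlb_entry iotlb;
};

struct pci_func {
	struct tlb_entry atc;
};

/* Second and third bullets: look up the CD by PASID, tag the IOTLB by ASID. */
static bool smmu_translate(struct smmu *s, uint32_t pasid, uint64_t iova)
{
	assert(pasid < NUM_PASIDS);
	struct ctx_desc *cd = &s->cd_table[pasid];

	if (!cd->valid)
		return false;	/* no context descriptor: request fails */

	/* Page table walk elided; assume it succeeds. */
	s->iotlb = (struct tlb_entry){ .valid = true, .tag = cd->asid, .iova = iova };
	return true;
}

/* First and last bullets: the function sends the request, then tags its ATC by PASID. */
static bool ats_request(struct pci_func *f, struct smmu *s,
			uint32_t pasid, uint64_t iova)
{
	if (!smmu_translate(s, pasid, iova))
		return false;

	f->atc = (struct tlb_entry){ .valid = true, .tag = pasid, .iova = iova };
	return true;
}
```

With a context descriptor at PASID 2 holding ASID 7, a successful
ats_request() leaves an IOTLB entry tagged 7 and an ATC entry tagged 2,
so nothing has to translate between the two IDs at lookup time.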

I think the ASID on Arm CPUs is roughly equivalent to Intel PCID. One
reason we use ASIDs for IOTLBs is that with SVA, the ASID of an address
space is the same on the CPU side. And when the CPU executes a TLB
invalidation instruction, it also invalidates the corresponding IOTLB
entries. It's nice for vSVA because you don't need to context-switch to
the host to send an IOTLB invalidation. But only non-PCI devices that
implement SVA benefit from this at the moment, because ATC invalidations
still have to go through the SMMU command queue.
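
As a rough illustration of why the struct quoted at the top carries
both IDs, here is one possible way a driver could interpret it. The
struct layout and flag names come from the proposal under discussion
(with stdint types standing in for __u32/__u64); struct inv_plan and
plan_invalidation() are hypothetical names for this sketch, not part
of the patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define IOMMU_INV_PASID_FLAGS_PASID	(1 << 0)
#define IOMMU_INV_PASID_FLAGS_ARCHID	(1 << 1)

/* Layout from the proposal quoted earlier in this thread. */
struct iommu_inv_pasid_info {
	uint32_t flags;
	uint32_t archid;	/* ASID on Arm SMMU */
	uint64_t pasid;
};

/* Hypothetical: which caches a single call would end up invalidating. */
struct inv_plan {
	bool     inv_iotlb;	/* IOTLB entries are tagged by ASID */
	uint32_t asid;
	bool     inv_atc;	/* ATC entries are tagged by PASID */
	uint64_t pasid;
};

/* Assumed semantics: each flag says whether the matching field is valid. */
static struct inv_plan plan_invalidation(const struct iommu_inv_pasid_info *info)
{
	struct inv_plan plan = {0};

	if (info->flags & IOMMU_INV_PASID_FLAGS_ARCHID) {
		plan.inv_iotlb = true;
		plan.asid = info->archid;
	}
	if (info->flags & IOMMU_INV_PASID_FLAGS_PASID) {
		plan.inv_atc = true;
		plan.pasid = info->pasid;
	}
	return plan;
}
```

Under these assumptions, a guest that sets only the ARCHID flag (the
nested-stage case without PASID) gets an IOTLB invalidation alone,
while setting both flags covers the IOTLB and the ATC in one call.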