Re: [PATCH v1 02/14] iommufd: Add nesting related data structures for ARM SMMUv3
From: Eric Auger
Date: Fri Mar 10 2023 - 06:34:12 EST
Hi,
On 3/9/23 14:42, Jean-Philippe Brucker wrote:
> Hi Nicolin,
>
> On Thu, Mar 09, 2023 at 02:53:38AM -0800, Nicolin Chen wrote:
>> Add the following data structures for corresponding ioctls:
>> iommu_hwpt_arm_smmuv3 => IOMMUFD_CMD_HWPT_ALLOC
>> iommu_hwpt_invalidate_arm_smmuv3 => IOMMUFD_CMD_HWPT_INVALIDATE
>>
>> Also, add IOMMU_HW_INFO_TYPE_ARM_SMMUV3 and IOMMU_PGTBL_TYPE_ARM_SMMUV3_S1
>> to the header and corresponding type/size arrays.
>>
>> Signed-off-by: Nicolin Chen <nicolinc@xxxxxxxxxx>
>> +/**
>> + * struct iommu_hwpt_arm_smmuv3 - ARM SMMUv3 specific page table data
>> + *
>> + * @flags: page table entry attributes
>> + * @s2vmid: Virtual machine identifier
>> + * @s1ctxptr: Stage-1 context descriptor pointer
>> + * @s1cdmax: Number of CDs pointed to by s1ContextPtr
>> + * @s1fmt: Stage-1 Format
>> + * @s1dss: Default substream
>> + */
>> +struct iommu_hwpt_arm_smmuv3 {
>> +#define IOMMU_SMMUV3_FLAG_S2 (1 << 0) /* if unset, stage-1 */
> I don't understand the purpose of this flag, since the structure only
> provides stage-1 configuration fields
>
>> +#define IOMMU_SMMUV3_FLAG_VMID (1 << 1) /* vmid override */
> Doesn't this break isolation? The VMID space is global for the SMMU, so
> the guest could access cached mappings of another device
>
>> + __u64 flags;
>> + __u32 s2vmid;
>> + __u32 __reserved;
>> + __u64 s1ctxptr;
>> + __u64 s1cdmax;
>> + __u64 s1fmt;
>> + __u64 s1dss;
>> +};
>> +
>
>> +/**
>> + * struct iommu_hwpt_invalidate_arm_smmuv3 - ARM SMMUv3 cahce invalidation info
>> + * @flags: boolean attributes of cache invalidation command
>> + * @opcode: opcode of cache invalidation command
>> + * @ssid: SubStream ID
>> + * @granule_size: page/block size of the mapping in bytes
>> + * @range: IOVA range to invalidate
>> + */
>> +struct iommu_hwpt_invalidate_arm_smmuv3 {
>> +#define IOMMU_SMMUV3_CMDQ_TLBI_VA_LEAF (1 << 0)
>> + __u64 flags;
>> + __u8 opcode;
>> + __u8 padding[3];
>> + __u32 asid;
>> + __u32 ssid;
>> + __u32 granule_size;
>> + struct iommu_iova_range range;
>> +};
> Although we can keep the alloc and hardware info separate for each IOMMU
> architecture, we should try to come up with common invalidation methods.
>
> It matters because things like vSVA, or just efficient dynamic mappings,
> will require optimal invalidation latency. A paravirtual interface like
> vhost-iommu can help with that, as the host kernel will handle guest
> invalidations directly instead of doing a round-trip to host userspace
> (and we'll likely want the same path for PRI.)
>
> Supporting HW page tables for a common PV IOMMU does require some
> architecture-specific knowledge, but invalidation messages contain roughly
> the same information on all architectures. The PV IOMMU won't include
> command opcodes for each possible architecture if one generic command does
> the same job.
>
> Ideally I'd like something like this for vhost-iommu:
>
> * slow path through userspace for complex requests like attach-table and
> probe, where the VMM can decode arch-specific information and translate
> them to iommufd and vhost-iommu ioctls to update the configuration.
>
> * fast path within the kernel for performance-critical requests like
> invalidate, page request and response. It would be absurd for the
> vhost-iommu driver to translate generic invalidation requests from the
> guest into arch-specific commands with special opcodes, when the next
> step is calling the IOMMU driver which does that for free.
>
> During previous discussions we came up with generic invalidations that
> could fit both Arm and x86 [1][2]. The only difference was the ASID
> (called archid/id in those proposals) which VT-d didn't need. Could we try
> to build on that?
I do agree with Jean. We spent a lot of efforts all together to define
this generic invalidation API and if there is compelling reason that
prevents from using it, we should try to reuse it.
Thanks
Eric
>
> [1] https://elixir.bootlin.com/linux/v5.17/source/include/uapi/linux/iommu.h#L161
> [2] https://lists.oasis-open.org/archives/virtio-dev/202102/msg00014.html
>
> Thanks,
> Jean
>