RE: [PATCH v1 5/8] vfio/type1: Report 1st-level/stage-1 format to userspace
From: Liu, Yi L
Date: Fri Apr 10 2020 - 08:30:51 EST
Hi Jean, Eric,
> From: Liu, Yi L <yi.l.liu@xxxxxxxxx>
> Sent: Thursday, April 9, 2020 8:47 PM
> Subject: RE: [PATCH v1 5/8] vfio/type1: Report 1st-level/stage-1 format to
> userspace
>
[...]
> > > >>
> > > >> Yes I don't think an u32 is going to cut it for Arm :( We need to
> > > >> describe all sorts of capabilities for page and PASID tables
> > > >> (granules, GPA size, ASID/PASID size, HW access/dirty, etc etc.)
> > > >> Just saying "Arm stage-1 format" wouldn't mean much. I guess we
> > > >> could have a secondary vendor capability for these?
> > > >
> > > > Actually, I'm wondering if we can define some formats to stand for a
> > > > set of capabilities, e.g. VTD_STAGE1_FORMAT_V1, which may indicate the
> > > > 1st level page table related caps (aw, a/d, SRE, EA, etc.). And vIOMMU
> > > > can parse the capabilities.
> > >
> > > But eventually do we really need all those capability getters? I mean
> > > can't we simply rely on the actual call to VFIO_IOMMU_BIND_GUEST_PGTBL()
> > > to detect any mismatch? Definitively the error handling may be heavier
> > > on userspace but can't we manage.
> >
> > I think we need to present these capabilities at boot time, long before
> > the guest triggers a bind(). For example if the host SMMU doesn't support
> > 16-bit ASID, we need to communicate that to the guest using vSMMU ID
> > registers or PROBE properties. Otherwise a bind() will succeed, but if the
> > guest uses 16-bit ASIDs in its CD, DMA will result in C_BAD_CD events
> > which we'll inject into the guest, for no apparent reason from their
> > perspective.
> >
> > In addition some VMMs may have fallbacks if shared page tables are not
> > available. They could fall back to a MAP/UNMAP interface, or simply not
> > present a vIOMMU to the guest.
> >
>
> Based on the comments, I think we need to report iommu caps in detail.
> So I guess the iommu uapi needs to provide something like the vfio cap
> chain. Please feel free to let me know your thoughts. :-)

Thinking about it more, I guess it may be better to start simpler. A cap
chain suits the case in which there are multiple caps, e.g. some vendor
iommu driver may want to report iommu capabilities via multiple caps.
On the VT-d side, the host IOMMU capability could be reported in a single
cap structure. I'm not sure about the ARM side. Will there be multiple
iommu_info_caps for ARM?

> In vfio, we can define a cap as below:
>
> struct vfio_iommu_type1_info_cap_nesting {
> 	struct vfio_info_cap_header header;
> 	__u64	iommu_model;
> #define VFIO_IOMMU_PASID_REQS		(1 << 0)
> #define VFIO_IOMMU_BIND_GPASID		(1 << 1)
> #define VFIO_IOMMU_CACHE_INV		(1 << 2)
> 	__u32	nesting_capabilities;
> 	__u32	pasid_bits;
> #define VFIO_IOMMU_VENDOR_SUB_CAP	(1 << 3)
> 	__u32	flags;
> 	__u32	data_size;
> 	__u8	data[];	/* iommu info caps defined by iommu uapi */
> };
>

If the iommu vendor driver only needs one cap structure to report hw
capability, then I think we needn't implement a cap chain in the iommu
uapi. The layout of the @data[] field could be determined by the
@iommu_model and @flags fields. This would be easier. Thoughts?

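As a rough sketch of what I mean (not a proposed uapi), userspace could
pick the @data[] layout from @iommu_model and @flags alone. The EX_* model
values, the ex_* names and the simplified structs below are assumptions
made up for illustration. Only the field names and the
VFIO_IOMMU_VENDOR_SUB_CAP bit come from the struct quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model IDs: the thread does not define values for @iommu_model. */
#define EX_IOMMU_MODEL_INTEL_VTD	1
#define EX_IOMMU_MODEL_ARM_SMMUV3	2

/* Bit position taken from the nesting cap proposed in this thread. */
#define VFIO_IOMMU_VENDOR_SUB_CAP	(1 << 3)

/* Simplified stand-in for the proposed nesting cap (vfio cap header omitted). */
struct ex_nesting_cap {
	uint64_t iommu_model;
	uint32_t nesting_capabilities;
	uint32_t pasid_bits;
	uint32_t flags;
	uint32_t data_size;
	uint8_t  data[];	/* vendor payload, layout chosen by iommu_model */
};

/* Stand-in for a single VT-d info structure with no chain header. */
struct ex_vtd_info {
	uint32_t vaddr_width;
	uint32_t ipaddr_width;
	uint64_t flags;
};

/*
 * Copy the VT-d payload out of @cap into @out.
 * Returns 0 on success, -1 if no vendor data is present, the model is
 * not VT-d, or the payload is too small.
 */
static int ex_get_vtd_info(const struct ex_nesting_cap *cap,
			   struct ex_vtd_info *out)
{
	assert(cap && out);

	if (!(cap->flags & VFIO_IOMMU_VENDOR_SUB_CAP))
		return -1;	/* @data[] carries nothing */
	if (cap->iommu_model != EX_IOMMU_MODEL_INTEL_VTD)
		return -1;	/* payload belongs to another vendor */
	if (cap->data_size < sizeof(*out))
		return -1;	/* truncated payload */

	memcpy(out, cap->data, sizeof(*out));
	return 0;
}
```

With this shape there is no inner chain to walk: one model check and one
size check are enough before the payload is used.
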
> VFIO needs new iommu APIs to ask iommu driver whether PASID/bind_gpasid/
> cache_inv/bind_gpasid_table is available or not and also the pasid
> bits. After that VFIO will ask iommu driver about the iommu_cap_info
> and fill in the @data[] field.
>
> iommu uapi:
> struct iommu_info_cap_header {
> 	__u16	id;		/* Identifies capability */
> 	__u16	version;	/* Version specific to the capability ID */
> 	__u32	next;		/* Offset of next capability */
> };
>
> #define IOMMU_INFO_CAP_INTEL_VTD 1
> struct iommu_info_cap_intel_vtd {
> 	struct iommu_info_cap_header header;
> 	__u32	vaddr_width;	/* VA addr_width */
> 	__u32	ipaddr_width;	/* IPA addr_width, input of SL page table */
> 	/* same definition as @flags in struct iommu_gpasid_bind_data_vtd */
> 	__u64	flags;
> };
>
> #define IOMMU_INFO_CAP_ARM_SMMUv3 2
> struct iommu_info_cap_arm_smmuv3 {
> 	struct iommu_info_cap_header header;
> 	...
> };
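
For comparison, if we do go with a cap chain in the iommu uapi, the
consumer side would need a walk roughly like the one below. This is only a
sketch. It assumes that @next is a byte offset from the start of the info
buffer and that 0 ends the chain, which is how the vfio cap chain works but
is not decided here for the iommu uapi. ex_find_cap() is a made-up helper
name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Layout copied from the proposal in this thread, with stdint types. */
struct iommu_info_cap_header {
	uint16_t id;		/* Identifies capability */
	uint16_t version;	/* Version specific to the capability ID */
	uint32_t next;		/* Offset of next capability */
};

#define IOMMU_INFO_CAP_INTEL_VTD	1
#define IOMMU_INFO_CAP_ARM_SMMUv3	2

/*
 * Look for capability @id in @buf, starting at offset @first.
 * Assumption: @first and every @next are byte offsets from the start of
 * @buf, and an offset of 0 terminates the chain.
 * Returns the offset of the matching header, or 0 if it is not found or
 * the chain is malformed (out of bounds or looping).
 */
static uint32_t ex_find_cap(const uint8_t *buf, size_t len,
			    uint32_t first, uint16_t id)
{
	uint32_t off = first;
	size_t hops = 0;

	assert(buf);

	while (off != 0) {
		struct iommu_info_cap_header hdr;

		if (off > len || len - off < sizeof(hdr))
			return 0;	/* header would run past the buffer */
		if (++hops > len)
			return 0;	/* more hops than bytes: chain loops */

		memcpy(&hdr, buf + off, sizeof(hdr));
		if (hdr.id == id)
			return off;
		off = hdr.next;
	}
	return 0;
}
```

The bounds and loop checks are the extra work a chain brings, which is
part of why a single cap structure looks simpler when a vendor only needs
one.
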
Regards,
Yi Liu