Re: [RFC 17/20] iommu/iommufd: Report iova range to userspace

From: Jean-Philippe Brucker
Date: Wed Sep 29 2021 - 08:08:29 EST


On Wed, Sep 29, 2021 at 10:44:01AM +0000, Liu, Yi L wrote:
> > From: Jean-Philippe Brucker <jean-philippe@xxxxxxxxxx>
> > Sent: Wednesday, September 22, 2021 10:49 PM
> >
> > On Sun, Sep 19, 2021 at 02:38:45PM +0800, Liu Yi L wrote:
> > > [HACK. will fix in v2]
> > >
> > > IOVA range is critical info for userspace to manage DMA for an I/O address
> > > space. This patch reports the valid iova range info of a given device.
> > >
> > > Due to aforementioned hack, this info comes from the hacked vfio type1
> > > driver. To follow the same format in vfio, we also introduce a cap chain
> > > format in IOMMU_DEVICE_GET_INFO to carry the iova range info.
> > [...]
> > > diff --git a/include/uapi/linux/iommu.h b/include/uapi/linux/iommu.h
> > > index 49731be71213..f408ad3c8ade 100644
> > > --- a/include/uapi/linux/iommu.h
> > > +++ b/include/uapi/linux/iommu.h
> > > @@ -68,6 +68,7 @@
> > > * +---------------+------------+
> > > * ...
> > > * @addr_width: the address width of supported I/O address spaces.
> > > + * @cap_offset: Offset within info struct of first cap
> > > *
> > > * Availability: after device is bound to iommufd
> > > */
> > > @@ -77,9 +78,11 @@ struct iommu_device_info {
> > > #define IOMMU_DEVICE_INFO_ENFORCE_SNOOP (1 << 0) /* IOMMU enforced snoop */
> > > #define IOMMU_DEVICE_INFO_PGSIZES (1 << 1) /* supported page sizes */
> > > #define IOMMU_DEVICE_INFO_ADDR_WIDTH (1 << 2) /* addr_wdith field valid */
> > > +#define IOMMU_DEVICE_INFO_CAPS (1 << 3) /* info supports cap chain */
> > > __u64 dev_cookie;
> > > __u64 pgsize_bitmap;
> > > __u32 addr_width;
> > > + __u32 cap_offset;
> >
> > We can also add vendor-specific page table and PASID table properties as
> > capabilities, otherwise we'll need giant unions in the iommu_device_info
> > struct. That made me wonder whether pgsize and addr_width should also be
> > separate capabilities for consistency, but this way might be good enough.
> > There won't be many more generic capabilities. I have "output address width"
>
> what do you mean by "output address width"? Is it the output address
> of stage-1 translation?

Yes, so the guest knows the size of GPA it can write into the page table.
For Arm SMMU the GPA size is determined by both the SMMU implementation
and the host kernel configuration. But maybe that could also be
vendor-specific, if other architectures don't need to communicate it.
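
To illustrate why the guest needs that width, here is a minimal sketch of
the check a guest (or VMM) could make before writing a GPA into a stage-1
page table. The helper name and the oas_bits parameter are made up for this
example; nothing in the RFC defines such a field yet.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the RFC: returns true if 'gpa' can be
 * expressed in an output address space that is 'oas_bits' bits wide.
 * 'oas_bits' stands in for the "output address width" discussed above.
 */
static bool gpa_fits_output_width(uint64_t gpa, unsigned int oas_bits)
{
	/* Shifting a 64-bit value by 64 or more is undefined in C. */
	if (oas_bits >= 64)
		return true;

	/* Any bit set at or above oas_bits means the address is too wide. */
	return (gpa >> oas_bits) == 0;
}
```

With a 40-bit output size, for instance, any GPA at or above 1 << 40 would
be rejected by this check.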

> > and "PASID width", the rest is specific to Arm and SMMU table
> > formats.
>
> When it comes to nested translation support, the stage-1 related info is
> likely to be vendor-specific, and will be reported in the cap chain.

Agreed
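
For reference, a rough sketch of how userspace might walk such a cap chain
to find a vendor-specific capability. It assumes the chain follows the vfio
layout (id, version, next offset measured from the start of the info
buffer, 0 terminating the chain), as the commit message suggests. The
struct and function names are invented for this example and the final UAPI
may well differ.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical header, modelled on vfio's struct vfio_info_cap_header. */
struct sketch_cap_header {
	uint16_t id;		/* capability identifier */
	uint16_t version;	/* capability version */
	uint32_t next;		/* offset of next cap from buffer start, 0 = end */
};

/*
 * Look for capability 'id' in an info buffer of 'size' bytes. 'first' is
 * the value of the cap_offset field. Returns the offset of the matching
 * header, or 0 if it is absent or the chain is malformed.
 */
static uint32_t sketch_find_cap(const void *buf, size_t size,
				uint32_t first, uint16_t id)
{
	/* A valid chain cannot hold more headers than fit in the buffer,
	 * so this also bounds the walk if the chain loops. */
	size_t budget = size / sizeof(struct sketch_cap_header);
	uint32_t off = first;

	while (off != 0 && budget-- > 0) {
		struct sketch_cap_header hdr;

		/* Reject offsets whose header would run past the buffer. */
		if (off > size || size - off < sizeof(hdr))
			return 0;

		/* memcpy avoids unaligned access into the raw buffer. */
		memcpy(&hdr, (const uint8_t *)buf + off, sizeof(hdr));
		if (hdr.id == id)
			return off;
		off = hdr.next;
	}
	return 0;
}
```

Userspace would call this only when the IOMMU_DEVICE_INFO_CAPS flag is set,
passing cap_offset as 'first'.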

Thanks,
Jean