Re: [RFC] /dev/ioasid uAPI proposal

From: Lu Baolu
Date: Tue Jun 01 2021 - 07:09:34 EST


Hi Jason,

On 2021/5/29 7:36, Jason Gunthorpe wrote:
/*
* Bind a user-managed I/O page table with the IOMMU
*
* Because user page table is untrusted, IOASID nesting must be enabled
* for this ioasid so the kernel can enforce its DMA isolation policy
* through the parent ioasid.
*
* Pgtable binding protocol is different from DMA mapping. The latter
* has the I/O page table constructed by the kernel and updated
* according to user MAP/UNMAP commands. With pgtable binding the
* whole page table is created and updated by userspace, thus different
* set of commands are required (bind, iotlb invalidation, page fault, etc.).
*
* Because the page table is directly walked by the IOMMU, the user
* must use a format compatible with the underlying hardware. It can
* check the format information through IOASID_GET_INFO.
*
* The page table is bound to the IOMMU according to the routing
* information of each attached device under the specified IOASID. The
* routing information (RID and optional PASID) is registered when a
* device is attached to this IOASID through VFIO uAPI.
*
* Input parameters:
* - child_ioasid;
* - address of the user page table;
* - formats (vendor, address_width, etc.);
*
* Return: 0 on success, -errno on failure.
*/
#define IOASID_BIND_PGTABLE _IO(IOASID_TYPE, IOASID_BASE + 9)
#define IOASID_UNBIND_PGTABLE _IO(IOASID_TYPE, IOASID_BASE + 10)
Also feels backwards, why wouldn't we specify this, and the required
page table format, during alloc time?


Regarding the required page table format, perhaps we should shed more
light on the page table of an IOASID. So far, an IOASID might represent
one of the following page tables (there might be more):

1) an IOMMU format page table (a.k.a. iommu_domain)
2) a user application CPU page table (SVA for example)
3) a KVM EPT (future option)
4) a VM guest managed page table (nesting mode)

This version only covers 1) and 4). Do you think we need to support 2),
3) and beyond? If so, it seems that we need some in-kernel helpers and
uAPIs to support pre-installing a page table into an IOASID. From this
point of view, an IOASID is actually not just a variant of iommu_domain,
but an I/O page table representation in a broader sense.
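
To make the four cases above concrete, here is a rough sketch of how the
page table type could be named if it were specified at alloc time. None of
these identifiers exist in the proposal; the enum, the struct and the two
helpers are purely hypothetical names used for illustration, and the real
uAPI may look quite different.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: the kinds of page table an IOASID might represent. */
enum ioasid_pgtable_type {
	IOASID_PGTABLE_IOMMU = 1,	/* 1) IOMMU format, kernel managed (iommu_domain) */
	IOASID_PGTABLE_CPU_SVA,		/* 2) user application CPU page table (SVA) */
	IOASID_PGTABLE_KVM_EPT,		/* 3) KVM EPT (future option) */
	IOASID_PGTABLE_GUEST,		/* 4) VM guest managed page table (nesting) */
};

/* Hypothetical alloc-time argument carrying the page table type. */
struct ioasid_alloc_args {
	uint32_t type;		/* enum ioasid_pgtable_type */
	uint32_t parent_ioasid;	/* only meaningful when has_parent is set */
	bool has_parent;	/* nesting: a parent enforces DMA isolation */
};

/* Only the kernel-built page table is updated through MAP/UNMAP. */
static bool ioasid_pgtable_uses_map_unmap(enum ioasid_pgtable_type t)
{
	return t == IOASID_PGTABLE_IOMMU;
}

/* A guest managed page table is untrusted, so nesting must be enabled. */
static bool ioasid_alloc_args_valid(const struct ioasid_alloc_args *a)
{
	if (a->type < IOASID_PGTABLE_IOMMU || a->type > IOASID_PGTABLE_GUEST)
		return false;
	if (a->type == IOASID_PGTABLE_GUEST && !a->has_parent)
		return false;
	return true;
}
```

With something along these lines, types 2) and 3) would be the ones that
need the in-kernel helpers mentioned above, since their page tables
already exist outside the IOMMU layer.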

Best regards,
baolu