Re: [RFC] /dev/ioasid uAPI proposal

From: David Gibson
Date: Thu Jun 03 2021 - 02:28:42 EST


On Tue, Jun 01, 2021 at 07:09:21PM +0800, Lu Baolu wrote:
> Hi Jason,
>
> On 2021/5/29 7:36, Jason Gunthorpe wrote:
> > > /*
> > > * Bind a user-managed I/O page table with the IOMMU
> > > *
> > > * Because user page table is untrusted, IOASID nesting must be enabled
> > > * for this ioasid so the kernel can enforce its DMA isolation policy
> > > * through the parent ioasid.
> > > *
> > > * Pgtable binding protocol is different from DMA mapping. The latter
> > > * has the I/O page table constructed by the kernel and updated
> > > * according to user MAP/UNMAP commands. With pgtable binding the
> > > * whole page table is created and updated by userspace, so a different
> > > * set of commands is required (bind, iotlb invalidation, page fault, etc.).
> > > *
> > > * Because the page table is directly walked by the IOMMU, the user
> > > * must use a format compatible with the underlying hardware. It can
> > > * check the format information through IOASID_GET_INFO.
> > > *
> > > * The page table is bound to the IOMMU according to the routing
> > > * information of each attached device under the specified IOASID. The
> > > * routing information (RID and optional PASID) is registered when a
> > > * device is attached to this IOASID through VFIO uAPI.
> > > *
> > > * Input parameters:
> > > * - child_ioasid;
> > > * - address of the user page table;
> > > * - formats (vendor, address_width, etc.);
> > > *
> > > * Return: 0 on success, -errno on failure.
> > > */
> > > #define IOASID_BIND_PGTABLE _IO(IOASID_TYPE, IOASID_BASE + 9)
> > > #define IOASID_UNBIND_PGTABLE _IO(IOASID_TYPE, IOASID_BASE + 10)
> > Also feels backwards, why wouldn't we specify this, and the required
> > page table format, during alloc time?
> >
>
> Thinking of the required page table format, perhaps we should shed more
> light on the page table of an IOASID. So far, an IOASID might represent
> one of the following page tables (might be more):
>
> 1) an IOMMU format page table (a.k.a. iommu_domain)
> 2) a user application CPU page table (SVA for example)
> 3) a KVM EPT (future option)
> 4) a VM guest managed page table (nesting mode)
>
> This version only covers 1) and 4). Do you think we need to support 2),

Isn't (2) the equivalent of using the host-managed pagetable and then
doing a giant MAP of all your user address space into it? But maybe we
should identify that case explicitly in case the host can optimize it.
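
To make the equivalence concrete, here is a toy sketch of what I mean by
a "giant MAP". Everything in it is hypothetical: the toy_* names are made
up for illustration and are not part of the proposed uAPI, and the 47-bit
user address space size is only an assumption (it matches 4-level paging
on x86-64; other platforms differ). It models a host-managed table that
holds a single identity mapping (IOVA == VA) over the whole user address
space.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy model, NOT the proposed uAPI: one mapped range in a
 * host-managed I/O page table. */
struct toy_map {
	uint64_t iova;  /* start of the I/O virtual address range */
	uint64_t vaddr; /* user virtual address backing that range */
	uint64_t size;  /* length of the range in bytes */
};

/* Assumption: 47-bit user address space (x86-64 with 4-level paging). */
#define TOY_USER_VA_SIZE (1ULL << 47)

/* The "giant MAP": identity-map all of user address space, IOVA == VA. */
static struct toy_map toy_map_whole_address_space(void)
{
	struct toy_map m = { .iova = 0, .vaddr = 0, .size = TOY_USER_VA_SIZE };
	return m;
}

/* Translate an IOVA through the single mapping.
 * Returns false (think: I/O page fault) if the IOVA is out of range. */
static bool toy_translate(const struct toy_map *m, uint64_t iova,
			  uint64_t *vaddr)
{
	if (iova < m->iova || iova - m->iova >= m->size)
		return false;
	*vaddr = m->vaddr + (iova - m->iova);
	return true;
}
```

With that one mapping installed, toy_translate() hands back the same
address it was given for any user address. The sketch says nothing about
how the host would keep such a mapping in step with the CPU page table,
which is the part an explicit SVA mode could let the host optimize.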

> 3) and beyond? If so, it seems that we need some in-kernel helpers and
> uAPIs to support pre-installing a page table to IOASID. From this point
> of view an IOASID is actually not just a variant of iommu_domain, but an
> I/O page table representation in a broader sense.
>
> Best regards,
> baolu
>
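
For reference, the four kinds of page table listed above could be
written down as a tagged type. This is only a sketch with made-up toy_*
names, not anything from the RFC. The two helpers encode just what the
quoted text states: this version covers 1) and 4), and a user-managed
table (case 4) needs IOASID nesting because it is untrusted.

```c
#include <stdbool.h>

/* Hypothetical names; the numbering follows the list quoted above. */
enum toy_pgtable_kind {
	TOY_PGTABLE_IOMMU_DOMAIN = 1, /* 1) IOMMU format page table */
	TOY_PGTABLE_CPU_SVA,          /* 2) user application CPU page table */
	TOY_PGTABLE_KVM_EPT,          /* 3) KVM EPT (future option) */
	TOY_PGTABLE_GUEST_NESTED,     /* 4) VM guest managed page table */
};

/* Per the quoted text, this version of the proposal covers 1) and 4). */
static bool toy_covered_by_this_version(enum toy_pgtable_kind kind)
{
	return kind == TOY_PGTABLE_IOMMU_DOMAIN ||
	       kind == TOY_PGTABLE_GUEST_NESTED;
}

/* A user-managed table is untrusted, so it must sit under a parent
 * ioasid through which the kernel enforces DMA isolation. */
static bool toy_requires_nesting(enum toy_pgtable_kind kind)
{
	return kind == TOY_PGTABLE_GUEST_NESTED;
}
```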

--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson
