Re: [RFC] /dev/ioasid uAPI proposal

From: Jason Gunthorpe
Date: Wed Jun 02 2021 - 19:23:59 EST

On Wed, Jun 02, 2021 at 12:01:57PM +0800, Lu Baolu wrote:
> On 6/2/21 1:26 AM, Jason Gunthorpe wrote:
> > On Tue, Jun 01, 2021 at 07:09:21PM +0800, Lu Baolu wrote:
> >
> > > This version only covers 1) and 4). Do you think we need to support 2),
> > > 3) and beyond?
> >
> > Yes aboslutely. The API should be flexable enough to specify the
> > creation of all future page table formats we'd want to have and all HW
> > specific details on those formats.
> OK, stay in the same line.
> > > If so, it seems that we need some in-kernel helpers and uAPIs to
> > > support pre-installing a page table to IOASID.
> >
> > Not sure what this means..
> Sorry that I didn't make this clear.
> Let me bring back the page table types in my eyes.
> 1) IOMMU format page table (a.k.a. iommu_domain)
> 2) user application CPU page table (SVA for example)
> 3) KVM EPT (future option)
> 4) VM guest managed page table (nesting mode)
> Each type of page table should be able to be associated with its IOASID.
> We have BIND protocol for 4); We explicitly allocate an iommu_domain for
> 1). But we don't have a clear definition for 2) 3) and others. I think
> it's necessary to clearly define a time point and kAPI name between
> IOASID_ALLOC and IOASID_ATTACH, so that other modules have the
> opportunity to associate their page table with the allocated IOASID
> before attaching the page table to the real IOMMU hardware.

In my mind these are all actions of creation..

#1 is ALLOC_IOASID 'to be compatible with thes devices attached to
this FD'
#3 is some ALLOC_IOASID_KVM (and maybe the kvm fd has to issue this ioctl)
#4 is ALLOC_IOASID_USER_PAGE_TABLE w/ user VA address or

Each allocation should have a set of operations that are allows
map/unmap is only legal on #1. invalidate is only legal on #4, etc.

How you want to split this up in the ioctl interface is a more
interesting question. I generally like more calls than giant unwieldly
multiplexer structs, but some things are naturally flags and optional
modifications of a single ioctl.

In any event they should have a similar naming 'ALLOC_IOASID_XXX' and
then a single 'DESTROY_IOASID' that works on all of them.

> I/O page fault handling is similar. The provider of the page table
> should take the responsibility to handle the possible page faults.

For the faultable types, yes #3 and #4 should hook in the fault
handler and deal with it.