Re: [RFC] /dev/ioasid uAPI proposal

From: Lu Baolu
Date: Thu Jun 03 2021 - 01:50:55 EST


On 6/3/21 7:23 AM, Jason Gunthorpe wrote:
On Wed, Jun 02, 2021 at 12:01:57PM +0800, Lu Baolu wrote:
On 6/2/21 1:26 AM, Jason Gunthorpe wrote:
On Tue, Jun 01, 2021 at 07:09:21PM +0800, Lu Baolu wrote:

This version only covers 1) and 4). Do you think we need to support 2),
3) and beyond?

Yes absolutely. The API should be flexible enough to specify the
creation of all future page table formats we'd want to have and all HW
specific details on those formats.

OK, we are on the same page.

If so, it seems that we need some in-kernel helpers and uAPIs to
support pre-installing a page table to IOASID.

Not sure what this means..

Sorry that I didn't make this clear.

Let me restate the page table types as I see them.

1) IOMMU format page table (a.k.a. iommu_domain)
2) user application CPU page table (SVA for example)
3) KVM EPT (future option)
4) VM guest managed page table (nesting mode)

Each type of page table should be able to be associated with its IOASID.
We have the BIND protocol for 4); we explicitly allocate an iommu_domain
for 1). But we don't have a clear definition for 2), 3) and others. I think
it's necessary to clearly define a time point and kAPI name between
IOASID_ALLOC and IOASID_ATTACH, so that other modules have the
opportunity to associate their page table with the allocated IOASID
before attaching the page table to the real IOMMU hardware.
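
To illustrate the ordering described above, here is a minimal user-space
sketch of the ALLOC -> associate page table -> ATTACH sequence. It is not
part of the proposal: the state names and the helpers ioasid_init(),
ioasid_set_pgtbl() and ioasid_attach() are hypothetical, and the page
table is treated as an opaque pointer owned by its provider.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical lifecycle states; none of these names exist in the kernel. */
enum ioasid_state {
    IOASID_ALLOCATED,   /* after IOASID_ALLOC, no page table yet */
    IOASID_PGTBL_SET,   /* a provider has associated its page table */
    IOASID_ATTACHED     /* after IOASID_ATTACH, visible to IOMMU hardware */
};

struct ioasid {
    enum ioasid_state state;
    const void *pgtbl;  /* opaque page table owned by the provider */
};

static void ioasid_init(struct ioasid *ia)
{
    assert(ia != NULL);
    ia->state = IOASID_ALLOCATED;
    ia->pgtbl = NULL;
}

/* The "time point" between ALLOC and ATTACH: legal only once, before attach. */
static bool ioasid_set_pgtbl(struct ioasid *ia, const void *pgtbl)
{
    assert(ia != NULL);
    if (ia->state != IOASID_ALLOCATED || pgtbl == NULL)
        return false;
    ia->pgtbl = pgtbl;
    ia->state = IOASID_PGTBL_SET;
    return true;
}

/* Attach is refused until a page table has been associated. */
static bool ioasid_attach(struct ioasid *ia)
{
    assert(ia != NULL);
    if (ia->state != IOASID_PGTBL_SET)
        return false;
    ia->state = IOASID_ATTACHED;
    return true;
}
```

In this sketch an attach attempted before any page table is associated
simply fails, which is the ordering guarantee the paragraph above asks for.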

In my mind these are all actions of creation..

#1 is ALLOC_IOASID 'to be compatible with the devices attached to
this FD'
#2 is ALLOC_IOASID_SVA
#3 is some ALLOC_IOASID_KVM (and maybe the kvm fd has to issue this ioctl)
#4 is ALLOC_IOASID_USER_PAGE_TABLE w/ user VA address or
ALLOC_IOASID_NESTED_PAGE_TABLE w/ IOVA address

Each allocation should have a set of operations that are allowed:
map/unmap is only legal on #1, invalidate is only legal on #4, etc.
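
One way to picture the per-type operation rules is a small lookup table.
The sketch below is only an illustration under assumed names: the enum
values, the table and ioasid_op_allowed() are hypothetical, and it encodes
just the two rules stated above (map/unmap on #1, invalidate on #4). The
empty entries for #2 and #3 are placeholders, not a claim about what those
types would finally permit.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical type tags for the four allocation flavours discussed. */
enum ioasid_type {
    IOASID_TYPE_IOMMU,       /* #1: kernel-managed IOMMU page table */
    IOASID_TYPE_SVA,         /* #2: shares a user CPU page table */
    IOASID_TYPE_KVM,         /* #3: shares the KVM EPT */
    IOASID_TYPE_USER_PGTBL,  /* #4: guest/user managed page table */
    IOASID_TYPE_COUNT
};

/* Hypothetical operation bits. */
enum ioasid_op {
    IOASID_OP_MAP        = 1u << 0,
    IOASID_OP_UNMAP      = 1u << 1,
    IOASID_OP_INVALIDATE = 1u << 2
};

/* Allowed operations per type; only the rules named in the thread are set. */
static const unsigned int ioasid_allowed_ops[IOASID_TYPE_COUNT] = {
    [IOASID_TYPE_IOMMU]      = IOASID_OP_MAP | IOASID_OP_UNMAP,
    [IOASID_TYPE_SVA]        = 0,
    [IOASID_TYPE_KVM]        = 0,
    [IOASID_TYPE_USER_PGTBL] = IOASID_OP_INVALIDATE
};

/* Returns true if 'op' is legal on an IOASID of the given type. */
static bool ioasid_op_allowed(enum ioasid_type type, enum ioasid_op op)
{
    assert(op != 0);
    if ((int)type < 0 || type >= IOASID_TYPE_COUNT)
        return false;
    return (ioasid_allowed_ops[type] & (unsigned int)op) != 0;
}
```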

This sounds reasonable. The corresponding page table types and required
callbacks are also part of it.


How you want to split this up in the ioctl interface is a more
interesting question. I generally like more calls than giant unwieldy
multiplexer structs, but some things are naturally flags and optional
modifications of a single ioctl.

In any event they should have a similar naming 'ALLOC_IOASID_XXX' and
then a single 'DESTROY_IOASID' that works on all of them.
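
The naming idea above can be modelled in a few lines of user-space C:
several ALLOC_IOASID_XXX-style entry points feeding one common allocator,
and a single destroy path that does not care about the type. This is a toy
sketch, not the uAPI: the table, its fixed size and all function names are
assumptions made up for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define IOASID_MAX     8      /* arbitrary table size for the sketch */
#define IOASID_INVALID (-1)

/* Hypothetical kinds, one per ALLOC_IOASID_XXX flavour. */
enum ioasid_kind {
    KIND_FREE = 0,
    KIND_IOMMU,
    KIND_SVA,
    KIND_KVM,
    KIND_USER_PGTBL
};

struct ioasid_table {
    enum ioasid_kind slots[IOASID_MAX];
};

/* Common allocator shared by every per-type entry point. */
static int ioasid_alloc_kind(struct ioasid_table *t, enum ioasid_kind kind)
{
    assert(t != NULL && kind != KIND_FREE);
    for (int i = 0; i < IOASID_MAX; i++) {
        if (t->slots[i] == KIND_FREE) {
            t->slots[i] = kind;
            return i;
        }
    }
    return IOASID_INVALID; /* table full */
}

/* Per-type entry points, mirroring the 'ALLOC_IOASID_XXX' naming. */
static int alloc_ioasid(struct ioasid_table *t)
{
    return ioasid_alloc_kind(t, KIND_IOMMU);
}

static int alloc_ioasid_sva(struct ioasid_table *t)
{
    return ioasid_alloc_kind(t, KIND_SVA);
}

static int alloc_ioasid_kvm(struct ioasid_table *t)
{
    return ioasid_alloc_kind(t, KIND_KVM);
}

static int alloc_ioasid_user_page_table(struct ioasid_table *t)
{
    return ioasid_alloc_kind(t, KIND_USER_PGTBL);
}

/* One destroy call that works on every kind of IOASID. */
static bool destroy_ioasid(struct ioasid_table *t, int id)
{
    assert(t != NULL);
    if (id < 0 || id >= IOASID_MAX || t->slots[id] == KIND_FREE)
        return false;
    t->slots[id] = KIND_FREE;
    return true;
}
```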

I/O page fault handling is similar. The provider of the page table
should take the responsibility to handle the possible page faults.

For the faultable types, yes #3 and #4 should hook in the fault
handler and deal with it.
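
As a rough sketch of "the provider handles the fault", a faultable IOASID
could carry a callback installed by whoever owns the page table. Everything
here is hypothetical (struct io_fault, ioasid_fault_fn,
ioasid_report_fault() and the counting handler); it only shows the
dispatch shape, not real IOMMU fault reporting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fault record passed to the page table provider. */
struct io_fault {
    uint64_t addr;   /* faulting I/O address */
    bool write;      /* true for a write access */
};

/* Callback type a page table provider would register. */
typedef bool (*ioasid_fault_fn)(const struct io_fault *fault,
                                void *provider_data);

struct ioasid_faultable {
    ioasid_fault_fn handler;  /* NULL means the type is not faultable */
    void *provider_data;      /* opaque cookie handed back to the provider */
};

/* Route a fault to the provider; returns false if nobody can handle it. */
static bool ioasid_report_fault(const struct ioasid_faultable *ia,
                                const struct io_fault *fault)
{
    assert(ia != NULL && fault != NULL);
    if (ia->handler == NULL)
        return false;
    return ia->handler(fault, ia->provider_data);
}

/* Example provider: counts faults and reports each one as resolved. */
static bool counting_handler(const struct io_fault *fault, void *provider_data)
{
    unsigned int *count = provider_data;

    (void)fault;
    (*count)++;
    return true;
}
```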

Agreed.

Best regards,
baolu