RE: [RFC] /dev/ioasid uAPI proposal

From: Tian, Kevin
Date: Tue Jun 01 2021 - 04:10:21 EST


> From: Jason Gunthorpe <jgg@xxxxxxxxxx>
> Sent: Saturday, May 29, 2021 1:36 AM
>
> On Thu, May 27, 2021 at 07:58:12AM +0000, Tian, Kevin wrote:
>
> > IOASID nesting can be implemented in two ways: hardware nesting and
> > software nesting. With hardware support the child and parent I/O page
> > tables are walked consecutively by the IOMMU to form a nested translation.
> > When it's implemented in software, the ioasid driver is responsible for
> > merging the two-level mappings into a single-level shadow I/O page table.
> > Software nesting requires both child/parent page tables operated through
> > the dma mapping protocol, so any change in either level can be captured
> > by the kernel to update the corresponding shadow mapping.
>
> Why? A SW emulation could do this synchronization during invalidation
> processing if invalidation contained an IOVA range.

In this proposal we differentiate between host-managed and user-
managed I/O page tables. If host-managed, the user is expected to use
map/unmap cmd explicitly upon any change required on the page table.
If user-managed, the user first binds its page table to the IOMMU and
then use invalidation cmd to flush iotlb when necessary (e.g. typically
not required when changing a PTE from non-present to present).

We expect user to use map+unmap and bind+invalidate respectively
instead of mixing them together. Following this policy, map+unmap
must be used in both levels for software nesting, so changes in either
level are captured timely to synchronize the shadow mapping.

>
> I think this document would be stronger to include some "Rational"
> statements in key places
>

Sure. I tried to provide rationale as much as possible but sometimes
it's lost in a complex context like this. :)

Thanks
Kevin