Re: [RFC PATCH 05/13] iommufd: Serialise persisted iommufds and ioas

From: Jason Gunthorpe
Date: Mon Oct 07 2024 - 11:16:47 EST


On Mon, Oct 07, 2024 at 08:39:53AM +0000, Gowans, James wrote:

> 2. Get userspace to do the work: userspace needs to re-do the ioctls
> after kexec to reconstruct the objects. My main issue with this approach
> is that the kernel needs to do some sort of trust but verify approach to
> ensure that userspace constructs everything the same way after kexec as
> it was before kexec. We don't want to end up in a state where the
> iommufd objects don't match the persisted page tables.

I think the verification is not so bad, it is just extracting the
physical addresses from the IOAS and comparing to what is stored in
the iommu_domain. If they don't match then the domain can't be adopted
to the IOAS.

We actually don't care about anything else, if userspace creates
different objects with different parameters who cares? All that
matters is that the radix tree contains the same expected information.

> What do you think of this 3rd approach? I can try to sketch it out and
> send another RFC if you think it sounds reasonable.

I think it is the same problem, just in a more maintainable
wrapper. You still have to serialize lots and lots of different
objects and their relationships.

> > Ie "recover" a HWPT from a KHO on a manually created a IOAS with the
> > right "memfd" for the backing storage. Then the recovery can just
> > validate that things are correct and adopt the iommu_domain as the
> > hwpt.
>
> This sounds more like option 2 where we expect userspace to re-drive the
> ioctls, but verify that they have corresponding payloads as before kexec
> so that iommufd objects are consistent with persisted page tables.

Yes

> If the kernel is doing verification wouldn't it be better for the kernel
> to do the ioctl work itself and give the resulting objects to
> userspace?

No :)

It is so much easier to validate the IOPTEs in a radix tree.

At the very worst you just create a HWPT and iommu_domain for
validation, do validation and then throw it away. Compare for two
radix trees is about 50 lines in generic pt, I have it already.

Jason