Re: [RFC PATCH 05/13] iommufd: Serialise persisted iommufds and ioas

From: Gowans, James
Date: Mon Oct 07 2024 - 04:57:28 EST


On Mon, 2024-10-07 at 09:47 +0100, David Woodhouse wrote:
> On Mon, 2024-10-07 at 08:39 +0000, Gowans, James wrote:
> >
> > I think we have two other possible approaches here:
> >
> > 1. What this RFC is sketching out, serialising fields from the structs
> > and setting those fields again on deserialise. As you point out this
> > will be complicated.
> >
> > 2. Get userspace to do the work: userspace needs to re-do the ioctls
> > after kexec to reconstruct the objects. My main issue with this approach
> > is that the kernel needs to do some sort of trust but verify approach to
> > ensure that userspace constructs everything the same way after kexec as
> > it was before kexec. We don't want to end up in a state where the
> > iommufd objects don't match the persisted page tables.
>
> To what extent does the kernel really need to trust or verify? At LPC
> we seemed to speak of a model where userspace builds a "new" address
> space for each device and then atomically switches to the new page
> tables instead of the original ones inherited from the previous kernel.
>
> That does involve having space for another set of page tables, of
> course, but that's not impossible.

The idea of constructing fresh page tables and then swapping over to
them is indeed appealing, but I don't know if that's always possible.
With the ARM SMMUv3, for example, I think there are break-before-make
requirements, so is it possible to do an atomic switch of the SMMUv3
page table PGD in a hitless way? Everything here must be hitless:
serialise and deserialise must not cause any DMA faults.

If it's not possible to do a hitless atomic switch (I am unsure about
this, need to RTFM) then we're compelled to re-use the existing page
tables, and if that's the case I think the kernel MUST ensure that the
iommufd IOAS objects exactly match the ones before kexec. I can imagine
all sorts of mess if those go out of sync!
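
To make the "verify" half of that concrete, here is a minimal
userspace-style sketch of the kind of check I have in mind. It is
purely illustrative: struct ioas_map and ioas_maps_match() are
hypothetical names, not existing iommufd code, and it assumes both the
persisted and the rebuilt mappings can be presented as arrays sorted by
IOVA.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one IOVA mapping; not a real iommufd struct. */
struct ioas_map {
	uint64_t iova;   /* start of the I/O virtual range */
	uint64_t length; /* size of the range in bytes */
	uint64_t phys;   /* physical address backing the range */
	uint32_t prot;   /* access permission bits */
};

/*
 * Return true only if the mappings rebuilt after kexec are identical to
 * the persisted ones. Both arrays are assumed to be sorted by iova.
 */
static bool ioas_maps_match(const struct ioas_map *persisted, size_t np,
			    const struct ioas_map *rebuilt, size_t nr)
{
	if (np != nr)
		return false;

	for (size_t i = 0; i < np; i++) {
		if (persisted[i].iova != rebuilt[i].iova ||
		    persisted[i].length != rebuilt[i].length ||
		    persisted[i].phys != rebuilt[i].phys ||
		    persisted[i].prot != rebuilt[i].prot)
			return false;
	}
	return true;
}
```

Any mismatch would have to fail the restore rather than be silently
accepted, since the live page tables are the ones still serving DMA.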