Re: [RFC PATCH kernel] iommufd: Allow mapping from KVM's guest_memfd

From: Xu Yilun

Date: Fri Feb 27 2026 - 23:34:22 EST


On Fri, Feb 27, 2026 at 09:18:15AM -0400, Jason Gunthorpe wrote:
> On Fri, Feb 27, 2026 at 06:35:44PM +0800, Xu Yilun wrote:
>
> > Will cause host machine check and host restart, same as host CPU
> > accessing encrypted memory. Intel TDX has no lower level privilege
> > protection table so the wrong accessing will actually impact the
> > memory encryption engine.
>
> Blah, of course it does.
>
> So Intel needs a two step synchronization to wipe the IOPTEs before
> any shared private conversions and restore the right ones after.

This is mainly about shared IOPTEs (for both the T=0 table & the T=1 table):
"unmap before conversion to private" & "map after conversion to shared".

I see there is already some consideration in QEMU for supporting in-place
conversion + shared passthrough [*] using uptr, but it seems that's
exactly what you are objecting to.

[*]: https://lore.kernel.org/all/18f64464-2ead-42d4-aeaa-f781020dca05@xxxxxxxxx/

For Intel, the T=1 private IOPTE reuses the S-EPT. This is the real CC
business and its correctness is managed by KVM & firmware, so no
notification is needed.

Furthermore, I think "unmap shared IOPTE before conversion to private"
may be the only step needed to ensure kernel safety; the other steps could
be left entirely to userspace. I hope downgrading the notification from
"remap" to "invalidate" simplifies it.
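
To illustrate the ordering, here is a toy userspace model in C. It is only
a sketch: every name in it (toy_gmem, toy_convert_to_private(), and so on)
is made up for illustration, and nothing here is an existing iommufd or
guest_memfd interface. The only thing the "kernel" side enforces is the
invalidate before shared -> private; mapping after private -> shared is a
separate call that userspace has to make.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model, not kernel code: per-page state only. */
#define NR_PAGES 8

enum page_state { PAGE_SHARED, PAGE_PRIVATE };

struct toy_gmem {
	enum page_state state[NR_PAGES];
	bool shared_iopte[NR_PAGES];	/* true if a shared IOPTE maps the page */
};

/* Reject ranges that fall outside the toy address space. */
static bool toy_range_ok(size_t start, size_t nr)
{
	return start <= NR_PAGES && nr <= NR_PAGES - start;
}

/* "invalidate" notification: drop shared IOPTEs, never install any. */
static void toy_invalidate_shared_ioptes(struct toy_gmem *g,
					 size_t start, size_t nr)
{
	for (size_t i = start; i < start + nr; i++)
		g->shared_iopte[i] = false;
}

/* shared -> private: unmap shared IOPTEs first, then flip the state. */
static int toy_convert_to_private(struct toy_gmem *g, size_t start, size_t nr)
{
	if (!toy_range_ok(start, nr))
		return -1;
	toy_invalidate_shared_ioptes(g, start, nr);
	for (size_t i = start; i < start + nr; i++)
		g->state[i] = PAGE_PRIVATE;
	return 0;
}

/* private -> shared: state change only, no IOMMU work on this path. */
static int toy_convert_to_shared(struct toy_gmem *g, size_t start, size_t nr)
{
	if (!toy_range_ok(start, nr))
		return -1;
	for (size_t i = start; i < start + nr; i++)
		g->state[i] = PAGE_SHARED;
	return 0;
}

/* Userspace-driven map of shared IOPTEs; refused if any page is private. */
static int toy_map_shared(struct toy_gmem *g, size_t start, size_t nr)
{
	if (!toy_range_ok(start, nr))
		return -1;
	for (size_t i = start; i < start + nr; i++)
		if (g->state[i] != PAGE_SHARED)
			return -1;
	for (size_t i = start; i < start + nr; i++)
		g->shared_iopte[i] = true;
	return 0;
}

/* Safety invariant: no private page is reachable through a shared IOPTE. */
static bool toy_invariant_holds(const struct toy_gmem *g)
{
	for (size_t i = 0; i < NR_PAGES; i++)
		if (g->state[i] == PAGE_PRIVATE && g->shared_iopte[i])
			return false;
	return true;
}
```

In this model, if userspace forgets toy_map_shared() after converting back
to shared, the device merely loses access to those pages; the invariant
checked by toy_invariant_holds() is never violated.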

>
> AMD needs a nasty HW synchronization with RMP changes, but otherwise
> wants to map the entire physical space.
>
> ARM doesn't care much, I think it could safely do either approach?
>
> These are very different behaviors so I would expect that userspace
> needs to signal which of the two it wants.
>
> It feels like we need a fairly complex dedicated synchronization logic
> in iommufd coupled to the shared/private machinery in guestmemfd
>
> Not really sure how to implement the Intel version right now, it is
> sort of like a nasty version of SVA..
>
> Jason