Re: [PATCH RFC v4 10/44] KVM: guest_memfd: Add support for KVM_SET_MEMORY_ATTRIBUTES2

From: Michael Roth

Date: Tue Apr 07 2026 - 18:10:04 EST

On Tue, Apr 07, 2026 at 02:50:58PM -0700, Vishal Annapurve wrote:
> On Tue, Apr 7, 2026 at 2:09 PM Michael Roth <michael.roth@xxxxxxx> wrote:
> >
> > > TLDR:
> > >
> > > + Think of populate ioctls not as KVM touching memory, but platform
> > > handling population.
> > > + KVM code (kvm_gmem_populate) still doesn't touch memory contents
> > > + post_populate is platform-specific code that handles loading into
> > > private destination memory just to support legacy non-in-place
> > > conversion.
> > > + Don't complicate populate ioctls by doing conversion just to support
> > > legacy use-cases where platform-specific code has to do copying on
> > > the host.
> >
> > That's a good point: these are only considerations in the context of
> > actually copying from src->dst, but with in-place conversion the
> > primary/more-performant approach will be for userspace to initialize
> > directly. I.e. if we enforced that, then gmem could rightly ascertain that
> > it isn't even writing to private pages via these hooks and any
> > manipulation of that memory is purely on the part of the trusted entity
> > handling initial encryption/etc.
> >
> > I understand that we decided to keep the option of allowing separate
> > src/dst even with in-place conversion, but it doesn't seem worthwhile if
> > that necessarily means we need to glue population+conversion together in
> > 1 clumsy interface that needs to handle partial return/error responses to
> > userspace (or potentially get stuck forever in the conversion path).
>
> I think ARM needs userspace to specify separate source and destination
> memory ranges for initial population as ARM doesn't support in-place
> memory encryption. [1]
>
> [1] https://lore.kernel.org/kvm/20260318155413.793430-25-steven.price@xxxxxxx/
>
> >
> > So I agree with Ackerley's proposal (which I guess is the same as what's
> > in this series).
> >
> > However, 1 other alternative would be to do what was suggested on the
> > call, but require userspace to subsequently handle the shared->private
> > conversion. I think that would be workable too.
>
> IIUC, converting memory ranges to private after they are already
> essentially treated as private by the KVM CC backend will expose the
> implementation to the same risk that guest_memfd was invented to
> address: userspace being able to access private memory and compromise
> host safety.

Doh, fair point. Doing conversion as part of the populate call would allow
us to use the filemap write-lock to avoid userspace being able to fault
in private (as tracked by trusted entity) pages before they are
transitioned to private (as tracked by KVM), so it's safer than having
userspace drive it.

But obviously I still think Ackerley's original proposal has more
upsides than the alternatives mentioned so far.

-Mike

>
> >
> > One other benefit to Ackerley's/current approach however is that it allows
> > us to potentially keep hugepages intact in the populate path, since
> > prep'ing/encrypting everything while it's in a shared state means gmem will
> > split the hugepage and all the firmware/RMP/etc. data structures will only
> > be able to handle individual 4K pages. I still suspect doing things like
> > encoding the initial 2MB OVMF image as a single hugepage might yield
> > enough benefit to explore this (at some point). So there's some niceness
> > in knowing that Ackerley's approach would allow for that eventually and
> > not require a complete rethink on these same topics.
> >
> > Thanks,
> >
> > Mike
> >
> > >
> > > >>>
> > > >>> [...snip...]
> > > >>>