Re: [PATCH v7 07/42] KVM: guest_memfd: Only prepare folios for private pages

From: Michael Roth

Date: Wed Jun 03 2026 - 10:03:42 EST

On Wed, Jun 03, 2026 at 09:58:45AM +0100, Suzuki K Poulose wrote:
> On 02/06/2026 23:41, Ackerley Tng wrote:
> > Suzuki K Poulose <suzuki.poulose@xxxxxxx> writes:
> >
> > >
> > > [...snip...]
> > >
> > > > > @@ -914,7 +916,8 @@ int kvm_gmem_get_pfn(struct kvm *kvm, struct
> > > > > kvm_memory_slot *slot,
> > > > >           folio_mark_uptodate(folio);
> > > > >       }
> > > > > -    r = kvm_gmem_prepare_folio(kvm, slot, gfn, folio);
> > > > > +    if (kvm_gmem_is_private_mem(inode, index))
> > > >
> > > > Don't we need to make sure the entire folio is private ? Not just the
> > > > page at the index ?
> > > >     if (kvm_gmem_range_is_private(, index, folio_nr_pages(folio)) ?
> >
> > I was thinking to fix this when I do huge pages, for now guest_memfd is
> > always just PAGE_SIZE, so just looking up index is fine.
> >
> > Is that okay?
>
> That's fine, but would be good to enforce that here, so that we don't miss
> out when we add support for multi-page folios.

We sort of already enforce that in kvm_gmem_get_folio():

	/*
	 * External interfaces like kvm_gmem_get_pfn() support dealing
	 * with hugepages to a degree, but internally, guest_memfd currently
	 * assumes that all folios are order-0 and handling would need
	 * to be updated for anything otherwise (e.g. page-clearing
	 * operations).
	 */
	WARN_ON_ONCE(!IS_ERR(folio) && folio_order(folio));

which was done as part of:

  commit 6538b6221cc2feda415ca1946e66a5ef02dc6a0a
  Author: Michael Roth <michael.roth@xxxxxxx>
  Date:   Thu Jan 8 15:46:18 2026 -0600

      KVM: guest_memfd: Remove partial hugepage handling from kvm_gmem_populate()

and that should trigger before you even reach the prepare path, so I think
that's covered.
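
To spell out the reasoning, here is a rough userspace sketch (not kernel code).
Every model_* name is hypothetical and only stands in for the real helpers.
It assumes the lookup step flags any folio with a non-zero order, so that by
the time the prepare path runs, checking the single index is equivalent to
checking the whole folio:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a folio; not the kernel's struct folio. */
struct model_folio {
	unsigned int order;	/* log2 of the number of pages in the folio */
};

/* Counts how many times the modelled warning has actually fired. */
static unsigned int model_warn_count;

/*
 * Mirrors the intent of WARN_ON_ONCE(): report only the first hit,
 * but always return the condition so callers can still act on it.
 */
static bool model_warn_on_once(bool cond)
{
	static bool warned;

	if (cond && !warned) {
		warned = true;
		model_warn_count++;
	}
	return cond;
}

static unsigned long model_folio_nr_pages(const struct model_folio *folio)
{
	return 1UL << folio->order;
}

/* Lookup step: flag any non-order-0 folio before callers get to use it. */
static struct model_folio *model_get_folio(struct model_folio *folio)
{
	model_warn_on_once(folio && folio->order);
	return folio;
}

/* With order-0 folios only, one index covers every page in the folio. */
static bool model_index_check_covers_folio(const struct model_folio *folio)
{
	assert(folio);
	return model_folio_nr_pages(folio) == 1;
}
```

In this model, an order-0 folio passes through silently and the single-index
check is sufficient, while an order-9 folio trips the warning once at lookup
time, before any prepare-style caller would look at it.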

In general, there was some previous discussion where we decided we would stop
wasting time guessing at what we'll need to do for hugepages and instead just
strip out the partial support. Sean wanted the folio order kept as part of the
internal API since we know the MMU will need that one way or another, but
elsewhere within guest_memfd we are okay to assume 4K. If we *know* certain
points will need to change then a comment mentioning it isn't a bad idea, but
even those comments have tended to be wrong so far about exactly what changes
are supposed to happen.

I'm not sure where the original discussion happened but there's some aftermath
discussion here[1] that I think summarizes current [non-]plans around
prepare+hugepages.

[1] https://lore.kernel.org/kvm/20250711163440.kwjebnzd7zeb4bxt@xxxxxxx/

>
> >
> > >
> > > Or rather, we should go through the individual pages and apply the
> > > prepare for ones that are private ?
> > >
> > > Suzuki
> > >
> >
> > IIRC the plan was to make kvm_gmem_prepare_folio() idempotent, as in, if
> > a page is already private, just skip. Currently sev_gmem_prepare() does
> > a pr_debug(), which I guess is technically still idempotent.
> >
> > I'm thinking that the information that needs tracking to make
> > .gmem_prepare() idempotent should be tracked by arch code.
> >
> > Does this work for ARM CCA?
>
> We don't hook into the prepare yet, but have plans to do that. We should
> be able to handle the pages that are already private. (For CCA context,
> RMI_GRANULE_DELEGATE_RANGE can skip over already REALM pages). So this
> should be fine.
>
> My point is, in a given folio, there may be pages that are shared.
> Like you said, this could be dealt with when we support hugepages.

Sounds good, that's also what SNP will do once hugepages come along.
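
Purely as an illustration of that eventual shape, a userspace sketch (not
kernel code, and not what SNP or CCA actually implement). The model_* names
and the per-page "prepared" flag are assumptions standing in for whatever
state arch code ends up tracking. It walks a multi-page folio, skips shared
pages, and treats already-prepared pages as a no-op:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_PAGES 512	/* enough for one 2M folio of 4K pages */

/* Hypothetical per-page state; real tracking would live elsewhere. */
struct model_page_state {
	bool is_private;	/* page is currently private, not shared */
	bool prepared;		/* assumed arch-side "already prepared" record */
};

/* Hypothetical multi-page folio; not the kernel's struct folio. */
struct model_huge_folio {
	size_t nr_pages;
	struct model_page_state pages[MODEL_MAX_PAGES];
};

/* Idempotent per-page prepare: returns true only if it did any work. */
static bool model_prepare_page(struct model_page_state *page)
{
	if (page->prepared)
		return false;
	page->prepared = true;
	return true;
}

/*
 * Walk the folio and prepare private pages only. Shared pages are left
 * untouched. Returns the number of pages newly prepared by this call.
 */
static size_t model_prepare_private_pages(struct model_huge_folio *folio)
{
	size_t i, done = 0;

	assert(folio->nr_pages <= MODEL_MAX_PAGES);
	for (i = 0; i < folio->nr_pages; i++) {
		if (!folio->pages[i].is_private)
			continue;
		if (model_prepare_page(&folio->pages[i]))
			done++;
	}
	return done;
}
```

Calling model_prepare_private_pages() twice on the same folio does work only
the first time, and a page converted to private later is picked up on the
next call without disturbing the pages prepared earlier.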

-Mike

>
> Suzuki
>
>
> >
> > > >
> > > > [...snip...]
> > > >
>