Re: [PATCH 1/2] x86/virt/tdx: Use PFN directly for mapping guest private memory

From: Yan Zhao

Date: Fri Mar 27 2026 - 03:44:01 EST


On Thu, Mar 26, 2026 at 12:57:26AM +0800, Edgecombe, Rick P wrote:
> On Wed, 2026-03-25 at 17:10 +0800, Yan Zhao wrote:
> > > I don't really understand what this is saying.
> > >
> > > Is the concern that KVM might want to set up page tables for memory
> > > that differ from how it was allocated? I'm a bit worried that this
> > > assumes something about folios that doesn't always hold.
> > >
> > > I think the hugetlbfs gigantic support uses folios in at least a
> > > few spots today.
> > Below is the background of this problem. I'll try to include a short
> > summary in the next version's patch logs.
>
> While this patchset is kind of pre-work for TDX huge pages, the reason
> to separate it out and push it earlier is because it has some value on
> it's own. So I'd think to focus mostly on the impact of the change
> today.
>
> How about this justification:
> 1. Because KVM handles guest memory as PFNs, and the SEAMCALLs under
> discussion are only used there, PFN is more natural.
>
> 2. The struct page was partly making sure we didn't pass a wrong arg
> (typical type safety) and partly ensuring that KVM doesn't pass non-
> convertible memory, however the SEAMCALLs themselves can check this for
> the kernel. So the case is already covered by warnings.
>
> In conclusion, the PFN is more natural and the original purpose of
> struct page is already covered.
>
>
> Sean said somewhere IIRC that he would have NAKed the struct page thing
> if he had seen it, for even the base support. And the two points above
> don't actually require discussion of even huge pages. So does it
> actually add any value to dive into the issues you list below?
I wanted to mention the issues listed below because I'm not sure if anyone has
the same question as me: why do we have to convert struct page to PFN if they
can both achieve the same purpose, given that currently all private memory
allocated by gmem has struct page backing?

The background can also reduce confusion if the patch log mentions hugepage and
single folio.

So, maybe also avoid mentioning hugepage and single folio things if you think
it's better to just mention the above two points?

> > In TDX huge page v3, I added logic that assumes PFNs are contained in
> > a single folio in both TDX's map/unmap paths [1][2]:
> > if (start_idx + npages > folio_nr_pages(folio))
> > return TDX_OPERAND_INVALID;
> > This not only assumes the PFNs have corresponding struct page, but
> > also assumes they must be contained in a single folio, since with
> > only base_page + npages, it's not easy to get the ith page's pointer
> > without first ensuring the pages are contained in a single folio.
> >
> > This should work since current KVM/guest_memfd only allocates memory
> > with struct page and maps them into S-EPT at a level lower than or
> > equal to the backend folio size. That is, a single S-EPT mapping
> > cannot span multiple backend folios.
> >
> > However, Ackerley's 1G hugetlb-based gmem splits the backend folio
> > [3] ahead of splitting/unmapping them from S-EPT [4], due to
> > implementation limitations mentioned at [5]. It makes the warning in
> > [1] hit upon invoking TDX's unmap callback.
> >
> > Moreover, Google's future gmem may manage PFNs independently in the
> > future,
>
> I think we can adapt to such changes when they eventually come up. It's
> not just about not merging code to be used by out-of-tree code. It also
> shrinks what we have to consider at each stage. So we can eventually
> get there.
Ok.

> > so TDX's private memory may have no corresponding struct page, and
> > KVM would map them via VM_PFNMAP, similar to mapping pass-through
> > MMIOs or other PFNs without struct page or with non-refcounted struct
> > page in normal VMs. Given that KVM has
> > suffered a lot from handling VM_PFNMAP memory for non-refcounted
> > struct page [6] in normal VMs, and TDX mapping/unmapping callbacks
> > have no semantic reason to dictate where and how KVM/guest_memfd
> > should allocate and map memory, Sean suggested dropping the
> > unnecessary assumption that memory to be mapped/unmapped to/from S-
> > EPT must be contained in a single folio (though he didn't object
> > reasonable sanity checks on if the PFNs are TDX convertible).
> >
> >
> > [1]
> > https://lore.kernel.org/kvm/20260106101929.24937-1-yan.y.zhao@xxxxxxxxx
> > [2]
> > https://lore.kernel.org/kvm/20260106101826.24870-1-yan.y.zhao@xxxxxxxxx
> > [3]
> > https://github.com/googleprodkernel/linux-cc/blob/wip-gmem-conversions-hugetlb-restructuring-12-08-25/virt/kvm/guest_memfd.c#L909
> > [4]
> > https://github.com/googleprodkernel/linux-cc/blob/wip-gmem-conversions-hugetlb-restructuring-12-08-25/virt/kvm/guest_memfd.c#L918
> > [5] https://lore.kernel.org/kvm/diqzqzrzdfvh.fsf@xxxxxxxxxx/
> > [6]
> > https://lore.kernel.org/all/20241010182427.1434605-1-seanjc@xxxxxxxxxx
>