Re: [RFC PATCH v5 44/45] KVM: x86/mmu: Add support for splitting S-EPT hugepages on conversion

From: Yan Zhao

Date: Wed Feb 11 2026 - 03:47:14 EST


On Thu, Jan 29, 2026 at 07:39:27AM -0800, Sean Christopherson wrote:
> Compile tested only...
It passed my local tests with the fix [1].

[1] https://lore.kernel.org/all/aYX-RpxDYrI65XRC@xxxxxxxxxx.

> @@ -1950,6 +1950,7 @@ struct kvm_x86_ops {
> 	void *(*alloc_apic_backing_page)(struct kvm_vcpu *vcpu);
> 	int (*gmem_prepare)(struct kvm *kvm, kvm_pfn_t pfn, gfn_t gfn, int max_order);
> 	void (*gmem_invalidate)(kvm_pfn_t start, kvm_pfn_t end);
> +	int (*gmem_convert)(struct kvm *kvm, gfn_t start, gfn_t end, bool to_private);
> 	int (*gmem_max_mapping_level)(struct kvm *kvm, kvm_pfn_t pfn, bool is_private);
> };
Since tdx_gmem_convert() only performs S-EPT splitting on the specified range,
would it make sense to rename the op from gmem_convert() to something like
gmem_split_private_mapping()?
(This would also mean renaming
kvm_gmem_convert() to kvm_gmem_split_private_mapping(), and
kvm_arch_gmem_convert() to kvm_arch_gmem_split_private_mapping().)

That way, the name would read naturally when the op is called from
kvm_gmem_set_attributes() for private-to-shared conversions, as well as from
kvm_gmem_punch_hole() or kvm_gmem_error_folio().

> +static int tdx_gmem_convert(struct kvm *kvm, gfn_t start, gfn_t end,
> +			    bool to_private)
> +{
> +	/*
> +	 * When converting from private=>shared, KVM must first split potential
> +	 * hugepages, as KVM mustn't overzap private mappings for TDX guests,
> +	 * i.e. must zap _exactly_ [start, end).  Split potential hugepages at
> +	 * the head and tail of the to-be-converted (and thus zapped) range so
> +	 * that KVM doesn't overzap due to dropping a hugepage that doesn't
> +	 * fall wholly inside the range.
> +	 */
> +	if (to_private || !kvm_has_mirrored_tdp(kvm))
> +		return 0;
> +
> +	/*
> +	 * Acquire the external cache lock, a.k.a. the Dynamic PAMT lock, to
> +	 * protect the per-VM cache of pre-allocated pages used to populate the
> +	 * Dynamic PAMT when splitting S-EPT huge pages.
> +	 */
> +	guard(mutex)(&to_kvm_tdx(kvm)->pamt_cache_lock);
Thanks for changing the spinlock to a mutex; that's a smart approach that
eliminates the need to drop the lock for cache top-up.

However, I have a question about kvm_tdp_mmu_try_split_huge_pages(), which is
called from the dirty-page-tracking paths. Might we want to invoke those paths
on mirror roots from a non-vCPU thread in the future? If so, would they need
some way to acquire this lock?

> +	guard(write_lock)(&kvm->mmu_lock);
> +
> +	/*
> +	 * TODO: Also split from PG_LEVEL_1G => PG_LEVEL_2M when KVM supports
> +	 * 1GiB S-EPT pages.
> +	 */
> +	return tdx_sept_split_huge_pages(kvm, start, end, PG_LEVEL_4K);
> +}
> +
> diff --git a/arch/x86/kvm/vmx/tdx.h b/arch/x86/kvm/vmx/tdx.h
> index f444fc84d93b..2bb4604a64ca 100644
> --- a/arch/x86/kvm/vmx/tdx.h
> +++ b/arch/x86/kvm/vmx/tdx.h
> @@ -48,6 +48,9 @@ struct kvm_tdx {
> 	 * Set/unset is protected with kvm->mmu_lock.
> 	 */
> 	bool wait_for_sept_zap;
> +
> +	struct tdx_pamt_cache pamt_cache;
> +	struct mutex pamt_cache_lock;
> };
>
> /* TDX module vCPU states */
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index c80cc60e7862..c3d71ba9a1dc 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -14061,7 +14061,7 @@ void kvm_arch_gmem_invalidate(kvm_pfn_t start, kvm_pfn_t end)
> int kvm_arch_gmem_convert(struct kvm *kvm, gfn_t start, gfn_t end,
> 			  bool to_private)
> {
> -	return 0;
> +	return kvm_x86_call(gmem_convert)(kvm, start, end, to_private);
> }
> #endif
> #endif
>
> base-commit: b2791d61e9774d8575525816e864d2e09ee9090a
> --