Re: [PATCH v7 15/17] KVM: selftests: Allocate a dedicated guest page for x86 L2 guest stack
From: Yosry Ahmed
Date: Thu May 28 2026 - 13:58:29 EST
On Wed, May 27, 2026 at 7:56 PM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
>
> On Wed, May 27, 2026, Yosry Ahmed wrote:
> > Instead of relying on the L1-provided stack for L2, which is usually an
> > array on L1's own stack, allocate a dedicated page of VM memory for the
> > L2 stack in vcpu_alloc_{vmx/svm}() and use that as L2's RSP in the
> > VMCS/VMCB instead of the L1-provided value.
> >
> > Most L1 guest code does not do anything with the L2 stack other than
> > stuff it in RSP, so this change is transparent and the L1-provided stack
> > is silently ignored. The only exception is memstress nested L1 code
> > which puts the vCPU index on L2's stack, so update this code to use the
> > newly allocated stack.
> >
> > L1-provided stacks will be dropped and cleaned up separately.
> >
> > Suggested-by: Sean Christopherson <seanjc@xxxxxxxxxx>
> > Signed-off-by: Yosry Ahmed <yosry@xxxxxxxxxx>
> > ---
>
> Blech. This exposed a nasty edge in selftests. For tests that enable TDP, the
> slots need to be identity mapped *after* allocating SVM/VMX. Found out the hard
> way: the gPAT test starting failing.
>
> Not worth worrying about right now, just one more wart in selftests that needs
> to be cleaned up.
I would squash that diff below to the gPAT test rather than this
patch, but up to you.
Generally speaking, the tdp_identity_map_default_memslots() API sucks.
It has to be called after all mappings/allocations are done, which is
soooo user unfriendly and when it breaks it takes time to figure out
what's going on.
I think it should go away. Ideally either:
(a) When TDP is enabled in the VM, tdp_identity_map_default_memslots()
is called, and then every time we create a new mapping in the VM we
mirror it in TDP page tables.
(b) Every time we update guest mappings, we mark the TDP page tables
as "out of sync", and we "resync" on vCPU run as needed.
Either way, it is not as straightforward as just calling
tdp_identity_map_default_memslots(), because the current code reuses
the same logic as stage-1 mappings, and it fails if there's already a
PTE. So consecutive calls will fail if they try to map the whole thing
-- unless we free all TDP page tables first.
I don't think I have time to spend on this right now, mainly
documenting my thoughts, and putting this out here in case you (or
anyone else) has enough hatred for the API to go do this before I get
the time to.