Re: [PATCH 3/4] KVM: nSVM: invalidate cached PDPTRs across nested NPT transitions

From: Paolo Bonzini

Date: Sat May 30 2026 - 12:54:31 EST


On Thu, May 28, 2026 at 8:33 PM Jim Mattson <jmattson@xxxxxxxxxx> wrote:
> > First, the PDPTRs are cached processor state but not a field of the
> > VMCB; changing the VMCB should have no effect on them. For SVM, changes
> > to their cache state are purely a result of writes to CR3 or CR4.PAE.
>
> Huh?
>
> The SDM says, "The behavior of PAE mode in a nested-paging guest
> differs slightly from the behavior of (host-only) legacy PAE mode, in
> that the guest’s four PDPEs are not loaded into the processor at the
> time CR3 is written. Instead, the PDPEs are accessed on demand as part
> of a table walk. This has the side-effect that illegal bit
> combinations in the PDPEs are not signaled at the time that CR3 is
> written, but instead when the faulty PDPE is accessed as part of a
> table walk."
>
> So, they are only cached as part of a partial walk, just like the
> entries at any other level. And, unlike Intel, changes to the PDPTRs
> in memory *may* be visible in a future page walk without changing CR3.

Yes, this is not in conflict with what I said because I was referring
to invalidation of cached PDPTRs.

On AMD, the PDPTRs are just a cached partial walk. How and whether they
are invalidated depends on how KVM's nested vmentry handles the writes
to CR3 or CR4.PAE. The APM says that, architecturally, writes to the
paging-related control registers do not have to flush the TLB, similar
to Intel with VPID. Hence, in principle, a walk associated with a given
L1 ASID could survive across multiple L2 vmentries.

On Intel, by contrast, the PDPTR values are not associated with a VPID;
they are just there. Merely changing the VMCS has an effect on the
PDPTRs, because their values now come from the new VMCS.

So while vmx_switch_vmcs() needs to mark the PDPTRs as not dirty/not
available, svm_switch_vmcb() does not have to. On SVM the invalidation
can instead be done when setting up the MMU for the new values of the
control registers.
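
To make the difference concrete, here is a toy model of the two
policies. It is only a sketch and not KVM code: every toy_* name, the
struct layout and the bit assignment are made up for illustration, and
the "CR3 or CR4.PAE changed" check is a simplification of whatever the
real MMU setup path would do.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical register-cache model; none of these names exist in KVM. */
enum { TOY_EXREG_PDPTR = 0 };

struct toy_vcpu {
	uint32_t regs_avail;	/* bit set: cached value is valid */
	uint32_t regs_dirty;	/* bit set: cached value needs write-back */
	uint64_t cr3;
	bool cr4_pae;
};

/* Mark the cached PDPTRs as neither available nor dirty. */
static void toy_drop_pdptr_cache(struct toy_vcpu *v)
{
	v->regs_avail &= ~(1u << TOY_EXREG_PDPTR);
	v->regs_dirty &= ~(1u << TOY_EXREG_PDPTR);
}

static bool toy_pdptrs_cached(const struct toy_vcpu *v)
{
	return v->regs_avail & (1u << TOY_EXREG_PDPTR);
}

/* Intel-like: PDPTR values come from the VMCS, so switching it stales them. */
static void toy_vmx_switch_vmcs(struct toy_vcpu *v)
{
	toy_drop_pdptr_cache(v);
}

/* AMD-like: the VMCB has no PDPTR fields, so the switch leaves them alone. */
static void toy_svm_switch_vmcb(struct toy_vcpu *v)
{
	(void)v;
}

/* AMD-like: invalidate while setting up the MMU for new CR3/CR4.PAE values. */
static void toy_svm_setup_mmu(struct toy_vcpu *v, uint64_t cr3, bool cr4_pae)
{
	if (v->cr3 != cr3 || v->cr4_pae != cr4_pae)
		toy_drop_pdptr_cache(v);
	v->cr3 = cr3;
	v->cr4_pae = cr4_pae;
}
```

In this model the VMCB switch alone never touches the cache; only a
change to CR3 or CR4.PAE seen at MMU setup time does.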

Paolo