Re: [PATCH v2 1/8] KVM: x86: nSVM: Redirect IA32_PAT accesses to either hPAT or gPAT
From: Jim Mattson
Date: Mon Feb 02 2026 - 15:02:23 EST
On Wed, Jan 21, 2026 at 5:21 PM Yosry Ahmed <yosry.ahmed@xxxxxxxxx> wrote:
>
> On Thu, Jan 15, 2026 at 03:21:40PM -0800, Jim Mattson wrote:
> > When the vCPU is in guest mode with nested NPT enabled, guest accesses to
> > IA32_PAT are redirected to the gPAT register, which is stored in
> > vmcb02->save.g_pat.
> >
> > Non-guest accesses (e.g. from userspace) to IA32_PAT are always redirected
> > to hPAT, which is stored in vcpu->arch.pat.
> >
> > This is architected behavior. It also makes it possible to restore a new
> > checkpoint on an old kernel with reasonable semantics. After the restore,
> > gPAT will be lost, and L2 will run on L1's PAT. Note that the old kernel
> > would have always run L2 on L1's PAT.
>
> This creates a difference in MSR_IA32_CR_PAT handling with nested SVM
> and nested VMX, right?
Correct.
> AFAICT, reading MSR_IA32_CR_PAT while an L2 VMX guest is running will
> read L2's PAT. With this change, the same scenario on SVM will read L1's
> PAT. We can claim that it was always L1's PAT though, because we've
> always been running L2 with L1's PAT.
>
> I am just raising this in case it's a problem to have different behavior
> for SVM and VMX. I understand that we need to do this to be able to
> save/restore L1's PAT with SVM in guest mode and maintain backward
> compatibility.
>
> IIUC VMX does not have the same issue because host and guest PAT are
> both in vmcs12 and are both saved/restored appropriately.
Correct.
> >
> > Signed-off-by: Jim Mattson <jmattson@xxxxxxxxxx>
> > ---
> > arch/x86/kvm/svm/svm.c | 31 ++++++++++++++++++++++++-------
> > 1 file changed, 24 insertions(+), 7 deletions(-)
> >
> > diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> > index 7041498a8091..3f8581adf0c1 100644
> > --- a/arch/x86/kvm/svm/svm.c
> > +++ b/arch/x86/kvm/svm/svm.c
> > @@ -2846,6 +2846,13 @@ static int svm_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> > case MSR_AMD64_DE_CFG:
> > msr_info->data = svm->msr_decfg;
> > break;
> > + case MSR_IA32_CR_PAT:
> > + if (!msr_info->host_initiated && is_guest_mode(vcpu) &&
> > + nested_npt_enabled(svm))
> > + msr_info->data = svm->vmcb->save.g_pat; /* gPAT */
> > + else
> > + msr_info->data = vcpu->arch.pat; /* hPAT */
> > + break;
> > default:
> > return kvm_get_msr_common(vcpu, msr_info);
> > }
> > @@ -2929,14 +2936,24 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
> >
> > break;
> > case MSR_IA32_CR_PAT:
> > - ret = kvm_set_msr_common(vcpu, msr);
> > - if (ret)
> > - break;
> > + if (!kvm_pat_valid(data))
> > + return 1;
> >
> > - svm->vmcb01.ptr->save.g_pat = data;
>
> This is a bug fix, L2 is now able to alter L1's PAT, right?
Yes, L1 and L2 share a PAT today. The whole series is fixing that mistake.
> > - if (is_guest_mode(vcpu))
> > - nested_vmcb02_compute_g_pat(svm);
> > - vmcb_mark_dirty(svm->vmcb, VMCB_NPT);
>
> This looks like another bug fix, it seems like we'll update vmcb01 but
> clear the clean bit in vmcb02 if we're in guest mode.
I will split this out into a separate patch, since this is tangential
to the rest of the series.
> Probably worth calling these out (and CC:stable, Fixes:.., etc)?
>
> We probably need a comment here explaining the gPAT vs hPAT case, I
> don't think it's super obvious why we only redirect L2's own accesses to
> its PAT but not userspace's.
Point taken.
> > + if (!msr->host_initiated && is_guest_mode(vcpu) &&
> > + nested_npt_enabled(svm)) {
> > + svm->vmcb->save.g_pat = data; /* gPAT */
> > + vmcb_mark_dirty(svm->vmcb, VMCB_NPT);
> > + } else {
> > + vcpu->arch.pat = data; /* hPAT */
>
> Should we call kvm_set_msr_common() here instead of setting
> vcpu->arch.pat? The kvm_pat_valid() call would be redundant but that
> should be fine. My main concern is if kvm_set_msr_common() gains more
> logic for MSR_IA32_CR_PAT that isn't reflected here. Probably unlikely
> tho..
There are already three open-coded assignments to vcpu->arch.pat on
the VMX side. Perhaps this should be avoided, but it's tangential to
this series.
> > + if (npt_enabled) {
> > + svm->vmcb01.ptr->save.g_pat = data;
> > + vmcb_mark_dirty(svm->vmcb01.ptr, VMCB_NPT);
> > + if (is_guest_mode(vcpu)) {
>
> IIUC (with the fix you mentioned) this is because L1 and L2 share the
> PAT if nested NPT is disabled, and if we're already in guest mode then
> we also need to update vmcb02 (as it was already created based on vmcb01
> with the old PAT). Probably worth a comment.
Noted.
> > + svm->vmcb->save.g_pat = data;
> > + vmcb_mark_dirty(svm->vmcb, VMCB_NPT);
> > + }
> > + }
> > + }
> > break;
> > case MSR_IA32_SPEC_CTRL:
> > if (!msr->host_initiated &&
> > --
> > 2.52.0.457.g6b5491de43-goog
> >