Re: [PATCH v5 08/10] KVM: x86: nSVM: Save/restore gPAT with KVM_{GET,SET}_NESTED_STATE

From: Jim Mattson

Date: Thu Mar 05 2026 - 13:49:28 EST


On Wed, Mar 4, 2026 at 11:55 AM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
>
> On Wed, Mar 04, 2026, Yosry Ahmed wrote:
> > On Wed, Mar 4, 2026 at 11:11 AM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
> > >
> > > On Wed, Mar 04, 2026, Jim Mattson wrote:
> > > > On Wed, Mar 4, 2026 at 9:11 AM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
> > > > > diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
> > > > > index 991ee4c03363..099bf8ac10ee 100644
> > > > > --- a/arch/x86/kvm/svm/nested.c
> > > > > +++ b/arch/x86/kvm/svm/nested.c
> > > > > @@ -1848,7 +1848,7 @@ static int svm_get_nested_state(struct kvm_vcpu *vcpu,
> > > > > >          if (is_guest_mode(vcpu)) {
> > > > > >                  kvm_state.hdr.svm.vmcb_pa = svm->nested.vmcb12_gpa;
> > > > > >                  if (nested_npt_enabled(svm)) {
> > > > > > -                        kvm_state.hdr.svm.flags |= KVM_STATE_SVM_VALID_GPAT;
> > > > > > +                        kvm_state->flags |= KVM_STATE_NESTED_GPAT_VALID;
> > > > > >                          kvm_state.hdr.svm.gpat = svm->vmcb->save.g_pat;
> > > > > >                  }
> > > > > >                  kvm_state.size += KVM_STATE_NESTED_SVM_VMCB_SIZE;
> > > > > @@ -1914,7 +1914,8 @@ static int svm_set_nested_state(struct kvm_vcpu *vcpu,
> > > > >
> > > > > >          if (kvm_state->flags & ~(KVM_STATE_NESTED_GUEST_MODE |
> > > > > >                                   KVM_STATE_NESTED_RUN_PENDING |
> > > > > > -                                 KVM_STATE_NESTED_GIF_SET))
> > > > > > +                                 KVM_STATE_NESTED_GIF_SET |
> > > > > > +                                 KVM_STATE_NESTED_GPAT_VALID))
> > > > > >                  return -EINVAL;
> > > >
> > > > Unless I'm missing something, this breaks forward compatibility
> > > > completely. An older kernel will refuse to accept a nested state blob
> > > > with GPAT_VALID set.
> > >
> > > Argh, so we've painted ourselves into an impossible situation by restricting the
> > > set of valid flags. I.e. VMX's omission of checks on unknown flags is a feature,
> > > not a bug.
> > >
> > > Chatted with Jim offlist, and he pointed out that KVM's standard way to deal with
> > > this is to make setting the flag opt-in, e.g. KVM_CAP_X86_TRIPLE_FAULT_EVENT and
> > > KVM_CAP_EXCEPTION_PAYLOAD.
> > >
> > > As much as I want to retroactively change KVM's documentation to state doing
> > > KVM_SET_NESTED_STATE with data that didn't come from KVM_GET_NESTED_STATE is
> > > unsupported, that feels too restrictive and could really bite us in the future.
> > > And it doesn't help if there's already userspace that's putting garbage into the
> > > header.
> > >
> > > So yeah, I don't see a better option than adding yet another capability.

Capability or quirk?

/me ducks.
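
For reference, the opt-in approach discussed above could be sketched roughly
as follows. This is a standalone userspace model, not KVM code: every demo_*
name, the DEMO_* flag values, and the gpat_flag_enabled field are hypothetical
stand-ins chosen for illustration. The idea it tries to show is that the "get"
side only reports the new flag once userspace has opted in, so a blob produced
without the opt-in is still accepted by a kernel that rejects unknown flags.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative flag bits; names and values are stand-ins, not the uapi ones. */
#define DEMO_NESTED_GUEST_MODE	(1u << 0)
#define DEMO_NESTED_RUN_PENDING	(1u << 1)
#define DEMO_NESTED_GIF_SET	(1u << 8)
#define DEMO_NESTED_GPAT_VALID	(1u << 9)	/* hypothetical new flag */

/* Hypothetical per-VM state, set when userspace enables the new capability. */
struct demo_vm {
	bool gpat_flag_enabled;
};

/* "Get" side: report the new flag only if userspace has opted in. */
uint32_t demo_get_flags(const struct demo_vm *vm, bool guest_mode, bool npt)
{
	uint32_t flags = DEMO_NESTED_GIF_SET;

	if (guest_mode) {
		flags |= DEMO_NESTED_GUEST_MODE;
		if (npt && vm->gpat_flag_enabled)
			flags |= DEMO_NESTED_GPAT_VALID;
	}
	return flags;
}

/* "Set" side as an older kernel would behave: unknown flags are rejected. */
int demo_old_set_flags(uint32_t flags)
{
	if (flags & ~(DEMO_NESTED_GUEST_MODE |
		      DEMO_NESTED_RUN_PENDING |
		      DEMO_NESTED_GIF_SET))
		return -EINVAL;
	return 0;
}

/* "Set" side as a newer kernel would behave: the new flag is also allowed. */
int demo_new_set_flags(uint32_t flags)
{
	if (flags & ~(DEMO_NESTED_GUEST_MODE |
		      DEMO_NESTED_RUN_PENDING |
		      DEMO_NESTED_GIF_SET |
		      DEMO_NESTED_GPAT_VALID))
		return -EINVAL;
	return 0;
}
```

In this model, a VMM that never opts in keeps producing state that the "old"
check accepts, while a VMM that opts in gets the new flag and can only restore
on a kernel that knows about it. Whether the opt-in would be a capability or a
quirk is exactly the open question; the sketch does not distinguish the two.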