Re: [PATCH] KVM: x86/mmu: Don't create SPTEs for addresses that aren't mappable
From: Edgecombe, Rick P
Date: Mon Feb 23 2026 - 18:28:32 EST
On Fri, 2026-02-20 at 16:49 -0800, Sean Christopherson wrote:
> On Sat, Feb 21, 2026, Rick P Edgecombe wrote:
> > On Wed, 2026-02-18 at 16:22 -0800, Sean Christopherson wrote:
> > > +static void reset_tdp_unmappable_mask(struct kvm_mmu *mmu)
> > > +{
> > > + int max_addr_bit;
> > > +
> > > + switch (mmu->root_role.level) {
> > > + case PT64_ROOT_5LEVEL:
> > > + max_addr_bit = 52;
> > > + break;
> > > + case PT64_ROOT_4LEVEL:
> > > + max_addr_bit = 48;
> > > + break;
> > > + case PT32E_ROOT_LEVEL:
> > > + max_addr_bit = 32;
> > > + break;
> > > + default:
> > > + WARN_ONCE(1, "Unhandled root level %u\n", mmu->root_role.level);
> > > + mmu->unmappable_mask = 0;
> >
> > Would it be better to set max_addr_bit to 0 and let rsvd_bits() set
> > it below? Then the unknown case is safer about rejecting things.
>
> No, because speaking from experience, rejecting isn't safer (I had a
> brain fart and thought legacy shadow paging was also affected).
> There's no danger to the host (other than the WARN itself), and so
> safety here is all about the guest.
>
> Setting unmappable_mask to -1ull is all but guaranteed to kill the
> guest, because KVM will reject all faults. Setting unmappable_mask
> to 0 is only problematic if the guest and/or userspace is
> misbehaving, and even then, the worst case scenario isn't horrific,
> all things considered.
Confused MM code makes me nervous, but fair enough.
>
> > > + return;
> > > + }
> > > +
> > > + mmu->unmappable_mask = rsvd_bits(max_addr_bit, 63);
> > > +}
> > > +
> >
> > Gosh, this forced me to expand my understanding of how the guest
> > and host page levels get glued together. Hopefully this is not too
> > far off...
> >
> > In the patch this function is passed both guest_mmu and root_mmu.
> > So sometimes it's going to be an L1 GPA, and sometimes (for
> > AMD nested?) it's going to be an L2 GVA. For the GVA case I don't
> > see how PT32_ROOT_LEVEL can be omitted. Wouldn't it hit the warning?
>
> No, it's always a GPA. root_mmu translates L1 GPA => L0 GPA and L1
> GVA => GPA*; guest_mmu translates L2 GPA => L0 GPA; nested_mmu
> translates L2 GVA => L2 GPA.
>
> Note! The asterisk is that root_mmu is also used when L2 is active
> if L1 is NOT using TDP, either because KVM isn't using TDP, or
> because the L1 hypervisor decided not to. In those cases, L2 GPA ==
> L1 GPA from KVM's perspective, because the L1 hypervisor is
> responsible for shadowing L2 GVA => L1 GPA. And root_mmu can also
> translate L2 GPA => L0 GPA and L2 GVA => L2 GPA (again, L1 GPA == L2
> GPA).
I appreciate you taking the time to explain. Tracing through with the
above I realize I was under the wrong impression about how nested SVM
worked.
>
> > But also the '5' case is weird because as a GVA the max address
> > bits should be 57 and as a GPA should be 54.
>
> 52, i.e. the architectural max MAXPHYADDR.
Oops, yes, I meant 52. But if it is always the max physical address and
not trying to handle VAs too, why is PT32E_ROOT_LEVEL 32 instead of
36? That also sent me down the path of assuming GVAs were in the mix,
but now I see it is used for 32-bit SVM.
>
[snip]
>
>
> > So I'd think this needs a version for GVA and one for GPA.
>
> No, see the last paragraph in the changelog.
>
> Side topic, if you have _any_ idea for better names than guest_mmu
> vs. nested_mmu, speak up. This is like the fifth? time I've had a
> discussion about how awful those names are, but we've yet to come up
> with names that suck less.
I don't. As above, I got confused by some wrong assumptions. The
names seem reasonable. A short note about the translation input and
output for each MMU might be nice to have somewhere.
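Something like the below, say, near the struct definitions. This is just me
paraphrasing your explanation above, so the details need double-checking:

```c
/*
 * root_mmu:   L1 GPA => L0 GPA, and L1 GVA => L1 GPA.  Also used for
 *             L2 when L1 is not using TDP, in which case L2 GPA ==
 *             L1 GPA from KVM's perspective.
 * guest_mmu:  L2 GPA => L0 GPA.
 * nested_mmu: L2 GVA => L2 GPA.
 */
```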