Re: [PATCH] KVM: x86/mmu: Don't create SPTEs for addresses that aren't mappable
From: Sean Christopherson
Date: Fri Feb 20 2026 - 19:49:24 EST
On Sat, Feb 21, 2026, Rick P Edgecombe wrote:
> On Wed, 2026-02-18 at 16:22 -0800, Sean Christopherson wrote:
> > +static void reset_tdp_unmappable_mask(struct kvm_mmu *mmu)
> > +{
> > +	int max_addr_bit;
> > +
> > +	switch (mmu->root_role.level) {
> > +	case PT64_ROOT_5LEVEL:
> > +		max_addr_bit = 52;
> > +		break;
> > +	case PT64_ROOT_4LEVEL:
> > +		max_addr_bit = 48;
> > +		break;
> > +	case PT32E_ROOT_LEVEL:
> > +		max_addr_bit = 32;
> > +		break;
> > +	default:
> > +		WARN_ONCE(1, "Unhandled root level %u\n", mmu->root_role.level);
> > +		mmu->unmappable_mask = 0;
>
> Would it be better to set max_addr_bit to 0 and let rsvd_bits() set it below?
> Then the unknown case is safer about rejecting things.
No, because speaking from experience, rejecting isn't safer (I had a brain fart
and thought legacy shadow paging was also affected). There's no danger to the
host (other than the WARN itself), and so safety here is all about the guest.
Setting unmappable_mask to -1ull is all but guaranteed to kill the guest, because
KVM will reject all faults. Setting unmappable_mask to 0 is only problematic if
the guest and/or userspace is misbehaving, and even then, the worst case scenario
isn't horrific, all things considered.
> > +		return;
> > +	}
> > +
> > +	mmu->unmappable_mask = rsvd_bits(max_addr_bit, 63);
> > +}
> > +
>
> Gosh, this forced me to expand my understanding of how the guest and host page
> levels get glued together. Hopefully this is not too far off...
>
> In the patch this function is passed both guest_mmu and root_mmu. So sometimes
> it's going to be an L1 GPA, and sometimes (for AMD nested?) it's going to be
> an L2 GVA. For the GVA case I don't see how PT32_ROOT_LEVEL can be omitted.
> It would hit the warning?
No, it's always a GPA. root_mmu translates L1 GPA => L0 GPA and L1 GVA => L1 GPA*;
guest_mmu translates L2 GPA => L0 GPA; nested_mmu translates L2 GVA => L2 GPA.
Note! The asterisk is that root_mmu is also used when L2 is active if L1 is NOT
using TDP, either because KVM isn't using TDP, or because the L1 hypervisor
decided not to. In those cases, L2 GPA == L1 GPA from KVM's perspective, because
the L1 hypervisor is responsible for shadowing L2 GVA => L1 GPA. And root_mmu
can also translate L2 GPA => L0 GPA and L2 GVA => L2 GPA (again, L1 GPA == L2 GPA).
> But also the '5' case is weird because as a GVA the max address bits should be
> 57 and as a GPA it should be 54.
52, i.e. the architectural max MAXPHYADDR.
> And that the TDP side uses 4 and 5 specifically, so the PT64_ just happens to
> match.
No, it's not a coincidence. The "truncation" to 52 bits is an architectural
quirk. Long ago, people decided 52 bits of PA were enough for anyone, and so
repurposed bits 63:52 for e.g. NX, SUPPRESS_VE, and software-available bits.
I.e. conceptually, 5-level paging allows for 57 bits of addressing, but EPT and
NPT define bits 63:52 to be other things.
> So I'd think this needs a version for GVA and one for GPA.
No, see the last paragraph in the changelog.
Side topic, if you have _any_ idea for better names than guest_mmu vs. nested_mmu,
speak up. This is like the fifth? time I've had a discussion about how awful
those names are, but we've yet to come up with names that suck less.