Re: [RFC PATCH v2 41/69] KVM: x86: Add infrastructure for stolen GPA bits

From: Edgecombe, Rick P
Date: Thu Aug 05 2021 - 13:07:19 EST


On Thu, 2021-08-05 at 16:06 +0000, Sean Christopherson wrote:
> On Thu, Aug 05, 2021, Kai Huang wrote:
> > On Fri, 2 Jul 2021 15:04:47 -0700 isaku.yamahata@xxxxxxxxx wrote:
> > > From: Rick Edgecombe <rick.p.edgecombe@xxxxxxxxx>
> > > @@ -2020,6 +2032,7 @@ static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu,
> > >  	sp = kvm_mmu_alloc_page(vcpu, direct);
> > >
> > >  	sp->gfn = gfn;
> > > +	sp->gfn_stolen_bits = gfn_stolen_bits;
> > >  	sp->role = role;
> > >  	hlist_add_head(&sp->hash_link, sp_list);
> > >  	if (!direct) {
> > > @@ -2044,6 +2057,13 @@ static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu,
> > >  	return sp;
> > >  }
> >
> >
> > Sorry for replying old thread,
>
> Ha, one month isn't old, it's barely even mature.
>
> > but to me it looks weird to have gfn_stolen_bits in 'struct
> > kvm_mmu_page'. If I understand correctly, above code basically means
> > that GFN with different stolen bit will have different 'struct
> > kvm_mmu_page', but in the context of this patch, mappings with
> > different stolen bits still use the same root,
>
> You're conflating "mapping" with "PTE". The GFN is a per-PTE value.
> Yes, there is a final GFN that is representative of the mapping, but
> more directly the final GFN is associated with the leaf PTE.
>
> TDX effectively adds the restriction that all PTEs used for a mapping
> must have the same shared/private status, so mapping and PTE are
> somewhat interchangeable when talking about stolen bits (the shared
> bit), but in the context of this patch, the stolen bits are a
> property of the PTE.
>
> Back to your statement, it's incorrect. PTEs (effectively mappings
> in TDX) with different stolen bits will _not_ use the same root.
> kvm_mmu_get_page() includes the stolen bits in both the hash lookup
> and in the comparison, i.e. restores the stolen bits when looking
> for an existing shadow page at the target GFN.
>
> @@ -1978,9 +1990,9 @@ static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu,
> 		role.quadrant = quadrant;
> 	}
>
> -	sp_list = &vcpu->kvm->arch.mmu_page_hash[kvm_page_table_hashfn(gfn)];
> +	sp_list = &vcpu->kvm->arch.mmu_page_hash[kvm_page_table_hashfn(gfn_and_stolen)];
> 	for_each_valid_sp(vcpu->kvm, sp, sp_list) {
> -		if (sp->gfn != gfn) {
> +		if ((sp->gfn | sp->gfn_stolen_bits) != gfn_and_stolen) {
> 			collisions++;
> 			continue;
> 		}
>
> > which means gfn_stolen_bits doesn't make a lot of sense at least
> > for root page table.
>
> It does make sense, even without a follow-up patch. In Rick's
> original series, stealing a bit for execute-only guest memory, there
> was only a single root. And except for TDX, there can only ever be a
> single root because the shared EPTP isn't usable, i.e. there's only
> the regular/private EPTP.
>
> > Instead, having gfn_stolen_bits in 'struct kvm_mmu_page' only
> > makes sense in the context of TDX, since TDX requires two separate
> > roots for private and shared mappings.
> > So given we cannot tell whether the same root, or different roots
> > should be used for different stolen bits, I think we should not
> > add 'gfn_stolen_bits' to 'struct kvm_mmu_page' and use it to
> > determine whether to allocate a new table for the same GFN, but
> > should use a new role (i.e. role.private) to determine.
>
> A new role would work, too, but it has the disadvantage of not
> automagically working for all uses of stolen bits, e.g. XO support
> would have to add another role bit.
>
> > And removing 'gfn_stolen_bits' in 'struct kvm_mmu_page' could also
> > save some memory.
>
> But I do like saving memory... One potentially bad idea would be to
> unionize gfn and stolen bits by shifting the stolen bits after
> they're extracted from the gpa, e.g.
>
> 	union {
> 		gfn_t gfn_and_stolen;
> 		struct {
> 			gfn_t gfn:52;
> 			gfn_t stolen:12;
> 		};
> 	};
>
> the downsides being that accessing just the gfn would require an
> additional masking operation, and the stolen bits wouldn't align
> with reality.

It definitely seems like the sp could be packed more efficiently. One
other idea is that the stolen bits could just be recovered from the
role bits with a helper, like how the page fault error code encoding
of the stolen bits works.

If the stolen bits were no longer fed into the hash calculation,
though, it would change the behavior a bit. Not sure if for better or
worse. The hash collision accounting would also need to be aware of
it.

FWIW, I kind of like something like Sean's proposal. It's a bit
convoluted, but there are more unused bits in the gfn than in the
role, and the stolen bits are a little more closely related to the
gfn.