Re: [RFC PATCH v2 41/69] KVM: x86: Add infrastructure for stolen GPA bits

From: Edgecombe, Rick P
Date: Thu Aug 05 2021 - 14:43:42 EST


On Thu, 2021-08-05 at 17:39 +0000, Sean Christopherson wrote:
> On Thu, Aug 05, 2021, Edgecombe, Rick P wrote:
> > On Thu, 2021-08-05 at 16:06 +0000, Sean Christopherson wrote:
> > > On Thu, Aug 05, 2021, Kai Huang wrote:
> > > > And removing 'gfn_stolen_bits' in 'struct kvm_mmu_page' could
> > > > also save some memory.
> > >
> > > But I do like saving memory... One potentially bad idea would be to
> > > unionize gfn and stolen bits by shifting the stolen bits after
> > > they're extracted from the gpa, e.g.
> > >
> > > union {
> > >         gfn_t gfn_and_stolen;
> > >         struct {
> > >                 gfn_t gfn:52;
> > >                 gfn_t stolen:12;
> > >         };
> > > };
> > >
> > > the downsides being that accessing just the gfn would require an
> > > additional masking operation, and the stolen bits wouldn't align
> > > with reality.
> >
> > It definitely seems like the sp could be packed more efficiently.
>
> Yeah, in general it could be optimized. But for TDP/direct MMUs, we
> don't care thaaat much because there are relatively few shadow pages,
> versus indirect MMUs with thousands or tens of thousands of shadow
> pages. Of course, indirect MMUs are also the most gluttonous due to
> the unsync_child_bitmap, gfns, write flooding count, etc...
>
> If we really want to reduce the memory footprint for the common case
> (TDP MMU), the crud that's used only by indirect shadow pages could be
> shoved into a different struct by abusing the struct layout and
> wrapping accesses to the indirect-only fields with casts/container_of
> and helpers, e.g.
>
Wow, didn't realize classic MMU was that relegated already. Mostly an
onlooker here, but does TDX actually need classic MMU support? Nice to
have?

> struct kvm_mmu_indirect_page {
>         struct kvm_mmu_page this;
>
>         gfn_t *gfns;
>         unsigned int unsync_children;
>         DECLARE_BITMAP(unsync_child_bitmap, 512);
>
> #ifdef CONFIG_X86_32
>         /*
>          * Used out of the mmu-lock to avoid reading spte values while
>          * an update is in progress; see the comments in
>          * __get_spte_lockless().
>          */
>         int clear_spte_count;
> #endif
>
>         /* Number of writes since the last time traversal visited this
>          * page. */
>         atomic_t write_flooding_count;
> };
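Just to check I follow: the accessor side of that would presumably be
something like the below? (Helper name invented by me, untested.)

static inline struct kvm_mmu_indirect_page *
to_indirect_sp(struct kvm_mmu_page *sp)
{
        /* Only meaningful for indirect (!role.direct) shadow pages. */
        WARN_ON_ONCE(sp->role.direct);
        return container_of(sp, struct kvm_mmu_indirect_page, this);
}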
>
>
> > One other idea is the stolen bits could just be recovered from the
> > role bits with a helper, like how the page fault error code encoding
> > of the stolen bits works.
>
> As in, a generic "stolen_gfn_bits" in the role instead of a per-feature
> role bit? That would avoid the problem of per-feature role bits leading
> to a pile of marshalling code, and wouldn't suffer the masking cost
> when accessing ->gfn, though I'm not sure that matters much.
Well, I was thinking there could be multiple types of aliases, similar
to how the pf err code version of this works, e.g.:

gfn_t stolen_bits(struct kvm *kvm, struct kvm_mmu_page *sp)
{
        gfn_t stolen = 0;

        if (sp->role.shared)
                stolen |= kvm->arch.gfn_shared_mask;
        if (sp->role.other_alias)
                stolen |= kvm->arch.gfn_other_mask;

        return stolen;
}

But yeah, there really only needs to be one. Still, bit shifting seems
better.
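To illustrate for anyone following along, accessors for the union above
could look something like this (hypothetical and untested;
gfn_stolen_shift is an invented per-VM field, not from the patch):

static inline gfn_t sp_gfn(struct kvm_mmu_page *sp)
{
        /* The bitfield read is the extra masking Sean mentioned. */
        return sp->gfn;
}

static inline gfn_t sp_stolen_gfn_bits(struct kvm *kvm,
                                       struct kvm_mmu_page *sp)
{
        /* Shift the stolen bits back to their real position in the gfn. */
        return (gfn_t)sp->stolen << kvm->arch.gfn_stolen_shift;
}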

>
> > If the stolen bits are not fed into the hash calculation, though, it
> > would change the behavior a bit. Not sure if for better or worse.
> > Also, the calculation of hash collisions would need to be aware.
>
> The role is already factored into the collision logic.
I mean that aliases of the same gfn don't necessarily collide, and that
whether the collisions counter gets incremented depends on the
gfn/stolen comparison, not on whether the role is different.
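For reference, the lookup loop in kvm_mmu_get_page() is roughly this
shape (paraphrased from memory, not verbatim):

sp_list = &kvm->arch.mmu_page_hash[kvm_page_table_hashfn(gfn)];
for_each_valid_sp(kvm, sp, sp_list) {
        if (sp->gfn != gfn) {
                collisions++;   /* different gfn sharing the bucket */
                continue;
        }
        if (sp->role.word != role.word)
                continue;       /* same gfn, different role: not counted */
        /* found an existing page */
}

So whether the stolen bits live in the gfn or in the role decides which
of those two paths an alias takes, and whether it shows up in the
collision stats.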

>
> > FWIW, I kind of like something like Sean's proposal. It's a bit
> > convoluted, but there are more unused bits in the gfn than the role.
>
> And tightly bound, i.e. there can't be more gfn + gfn_stolen bits than
> fit in a gfn_t.
>
> > Also they are a little more related.