Re: [RFC PATCH v2 41/69] KVM: x86: Add infrastructure for stolen GPA bits

From: Sean Christopherson
Date: Thu Aug 05 2021 - 13:39:17 EST


On Thu, Aug 05, 2021, Edgecombe, Rick P wrote:
> On Thu, 2021-08-05 at 16:06 +0000, Sean Christopherson wrote:
> > On Thu, Aug 05, 2021, Kai Huang wrote:
> > > And removing 'gfn_stolen_bits' in 'struct kvm_mmu_page' could also save
> > > some memory.
> >
> > But I do like saving memory... One potentially bad idea would be to
> > unionize gfn and stolen bits by shifting the stolen bits after they're
> > extracted from the gpa, e.g.
> >
> > union {
> > gfn_t gfn_and_stolen;
> > struct {
> > gfn_t gfn:52;
> > gfn_t stolen:12;
> > }
> > };
> >
> > the downsides being that accessing just the gfn would require an additional
> > masking operation, and the stolen bits wouldn't align with reality.
>
> It definitely seems like the sp could be packed more efficiently.

Yeah, in general it could be optimized. But for TDP/direct MMUs, we don't care
thaaat much because there are relatively few shadow pages, versus indirect MMUs
with thousands or tens of thousands of shadow pages. Of course, indirect MMUs
are also the most gluttonous due to the unsync_child_bitmap, gfns, write flooding
count, etc...

If we really want to reduce the memory footprint for the common case (TDP MMU),
the crud that's used only by indirect shadow pages could be shoved into a
different struct by abusing the struct layout and and wrapping accesses to the
indirect-only fields with casts/container_of and helpers, e.g.

struct kvm_mmu_indirect_page {
struct kvm_mmu_page this;

gfn_t *gfns;
unsigned int unsync_children;
DECLARE_BITMAP(unsync_child_bitmap, 512);

#ifdef CONFIG_X86_32
/*
* Used out of the mmu-lock to avoid reading spte values while an
* update is in progress; see the comments in __get_spte_lockless().
*/
int clear_spte_count;
#endif

/* Number of writes since the last time traversal visited this page. */
atomic_t write_flooding_count;
}


> One other idea is the stolen bits could just be recovered from the role
> bits with a helper, like how the page fault error code stolen bits
> encoding version of this works.

As in, a generic "stolen_gfn_bits" in the role instead of a per-feature role bit?
That would avoid the problem of per-feature role bits leading to a pile of
marshalling code, and wouldn't suffer the masking cost when accessing ->gfn,
though I'm not sure that matters much.

> If the stolen bits are not fed into the hash calculation though it
> would change the behavior a bit. Not sure if for better or worse. Also
> the calculation of hash collisions would need to be aware.

The role is already factored into the collision logic.

> FWIW, I kind of like something like Sean's proposal. It's a bit
> convoluted, but there are more unused bits in the gfn than the role.

And tightly bound, i.e. there can't be more than gfn_t gfn+gfn_stolen bits.

> Also they are a little more related.