Re: [PATCH 2/7] KVM: x86/MMU: Move rmap_iterator to rmap.h

From: Ben Gardon
Date: Wed Dec 14 2022 - 12:53:23 EST


On Tue, Dec 13, 2022 at 4:59 PM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
>
> On Tue, Dec 13, 2022, Ben Gardon wrote:
> > On Fri, Dec 9, 2022 at 3:04 PM David Matlack <dmatlack@xxxxxxxxxx> wrote:
> > >
> > > > +/*
> > > > + * Used by the following functions to iterate through the sptes linked by a
> > > > + * rmap. All fields are private and not assumed to be used outside.
> > > > + */
> > > > +struct rmap_iterator {
> > > > + /* private fields */
> > > > + struct pte_list_desc *desc; /* holds the sptep if not NULL */
> > > > + int pos; /* index of the sptep */
> > > > +};
> > > > +
> > > > +u64 *rmap_get_first(struct kvm_rmap_head *rmap_head,
> > > > + struct rmap_iterator *iter);
> > > > +u64 *rmap_get_next(struct rmap_iterator *iter);
> > > > +
> > > > +#define for_each_rmap_spte(_rmap_head_, _iter_, _spte_) \
> > > > + for (_spte_ = rmap_get_first(_rmap_head_, _iter_); \
> > > > + _spte_; _spte_ = rmap_get_next(_iter_))
> > > > +
> > >
> > > I always found these function names and kvm_rmap_head confusing since
>
> Heh, you definitely aren't the only one.
>
> > > they are about iterating through the pte_list_desc data structure. The
> > > rmap (gfn -> list of sptes) is a specific application of the
> > > pte_list_desc structure, but not the only application. There's also
> > > parent_ptes in struct kvm_mmu_page, which is not an rmap, just a plain
> > > old list of ptes.
> >
> > > While you are refactoring this code, what do you think about doing the
> > > following renames?
> > >
> > > struct kvm_rmap_head -> struct pte_list_head
> > > struct rmap_iterator -> struct pte_list_iterator
> > > rmap_get_first() -> pte_list_get_first()
> > > rmap_get_next() -> pte_list_get_next()
> > > for_each_rmap_spte() -> for_each_pte_list_entry()
>
> I would strongly prefer to keep "spte" in this one regardless of what other naming
> changes we do (see below). Maybe just for_each_spte()? IMO, "pte_list_entry"
> unnecessarily obfuscates that it's a list of SPTEs.
>
> > > Then we can reserve the term "rmap" just for the actual rmap
> > > (slot->arch.rmap), and code that deals with sp->parent_ptes will become
> > > a lot more clear IMO (because it will no longer mention rmap).
> > >
> > > e.g. We go from this:
> > >
> > > struct rmap_iterator iter;
> > > u64 *sptep;
> > >
> > > for_each_rmap_spte(&sp->parent_ptes, &iter, sptep) {
> > > ...
> > > }
> > >
> > > To this:
> > >
> > > struct pte_list_iterator iter;
> > > u64 *sptep;
> > >
> > > for_each_pte_list_entry(&sp->parent_ptes, &iter, sptep) {
> > > ...
> > > }
> >
> > I like this suggestion, and I do think it'll make things more
> > readable. It's going to be a huge patch to rename all the instances of
> > kvm_rmap_head, but it's probably worth it.
>
> I generally like this idea too, but tying into my above comment, before jumping
> in I think we should figure out what end state we want, i.e. get the bikeshedding
> out of the way now to hopefully avoid dragging out a series while various things
> get nitpicked.
>
> E.g. if we just rename the structs and their macros, then we'll end up with
> things like
>
> static bool slot_rmap_write_protect(struct kvm *kvm,
> struct pte_list_head *rmap_head,
> const struct kvm_memory_slot *slot)
> {
> return rmap_write_protect(rmap_head, false);
> }
>
> which isn't terrible, but there's still opportunity for cleanup, e.g.
> rmap_write_protect() could easily be sptes_write_protect() or write_protect_sptes().
>
> That will generate a naming conflict of sorts with pte_list_head if we don't also
> rename that to spte_list_head. And I think capturing that it's a list of SPTEs and
> not guest PTEs will be helpful in general.
>
> And if we rename pte_list_head, then we might as well commit 100% and use consistent
> nomenclature across the board, e.g. end up with
>
> static bool sptes_clear_dirty(struct kvm *kvm, struct spte_list_head *head,
> const struct kvm_memory_slot *slot)
> {
> u64 *sptep;
> struct spte_list_iterator iter;
> bool flush = false;
>
> for_each_spte(head, &iter, sptep) {
> if (spte_ad_need_write_protect(*sptep))
> flush |= spte_wrprot_for_clear_dirty(sptep);
> else
> flush |= spte_clear_dirty(sptep);
> }
>
> return flush;
> }
>
> versus the current
>
> static bool __rmap_clear_dirty(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
> const struct kvm_memory_slot *slot)
> {
> u64 *sptep;
> struct rmap_iterator iter;
> bool flush = false;
>
> for_each_rmap_spte(rmap_head, &iter, sptep)
> if (spte_ad_need_write_protect(*sptep))
> flush |= spte_wrprot_for_clear_dirty(sptep);
> else
> flush |= spte_clear_dirty(sptep);
>
> return flush;
> }

I'd be happy to see some consistent SPTE-based naming in the Shadow
MMU and more or less get rid of the rmap naming scheme. Once you
change to spte_list_head or whatever, the use of the actual rmap (an
array of spte_list_heads) becomes super narrow.
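
To make that concrete, here is a rough userspace sketch (not kernel code)
of a generic SPTE list with the rmap reduced to a plain array of list
heads. Everything in it is hypothetical: spte_list_desc, SPTES_PER_DESC,
spte_list_count() and sptes_clear_bits() are made-up stand-ins, and the
real pte_list_desc packing and the rmap head's pointer encoding are
deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SPTES_PER_DESC 4	/* hypothetical chunk size, illustration only */

/* Simplified stand-in for pte_list_desc: a chunk of SPTE pointers. */
struct spte_list_desc {
	struct spte_list_desc *more;	/* next chunk, or NULL */
	int count;			/* used slots; assumed > 0 for linked chunks */
	uint64_t *sptes[SPTES_PER_DESC];
};

/* Generic list head; the "rmap" is then just an array of these. */
struct spte_list_head {
	struct spte_list_desc *first;
};

struct spte_list_iterator {
	struct spte_list_desc *desc;	/* chunk currently being walked */
	int pos;			/* index of the current sptep in that chunk */
};

static uint64_t *spte_list_get_first(struct spte_list_head *head,
				     struct spte_list_iterator *iter)
{
	assert(head && iter);
	iter->desc = head->first;
	iter->pos = 0;
	if (!iter->desc || iter->desc->count == 0)
		return NULL;
	return iter->desc->sptes[0];
}

static uint64_t *spte_list_get_next(struct spte_list_iterator *iter)
{
	if (!iter->desc)
		return NULL;
	if (++iter->pos < iter->desc->count)
		return iter->desc->sptes[iter->pos];

	/* Current chunk exhausted, move on to the next one. */
	iter->desc = iter->desc->more;
	iter->pos = 0;
	if (iter->desc && iter->desc->count > 0)
		return iter->desc->sptes[0];
	return NULL;
}

#define for_each_spte(_head_, _iter_, _sptep_)			\
	for (_sptep_ = spte_list_get_first(_head_, _iter_);	\
	     _sptep_; _sptep_ = spte_list_get_next(_iter_))

/* Count the SPTEs reachable from one list head. */
static int spte_list_count(struct spte_list_head *head)
{
	struct spte_list_iterator iter;
	uint64_t *sptep;
	int n = 0;

	for_each_spte(head, &iter, sptep)
		n++;
	return n;
}

/*
 * Toy analogue of the sptes_clear_dirty() shape: clear the bits in @mask
 * from every SPTE on the list, and report whether anything changed.
 */
static bool sptes_clear_bits(struct spte_list_head *head, uint64_t mask)
{
	struct spte_list_iterator iter;
	uint64_t *sptep;
	bool changed = false;

	for_each_spte(head, &iter, sptep) {
		if (*sptep & mask) {
			*sptep &= ~mask;
			changed = true;
		}
	}
	return changed;
}
```

With helpers shaped like this, nothing outside the code that indexes the
per-gfn array needs the word "rmap" at all; a parent_ptes-style list
would use the same head, iterator and macro.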

Given the potential for enormous scope creep on what's already going
to be a long series, I'm inclined to split this work into two parts:
1. Move code from mmu.c to shadow_mmu.c with minimal cleanups /
refactors / renames; just move the code
2. Clean up naming conventions: make the functions exported in
shadow_mmu.h consistent, get rid of the whole rmap naming scheme, etc.

That way git-blame will preserve context around the renames /
refactors which would be obfuscated if we did 2 before 1, and we can
reduce merge conflicts.