Re: [PATCH v2 15/15] KVM: x86/tdp_mmu: Add a helper function to walk down the TDP MMU

From: Isaku Yamahata
Date: Wed Jun 12 2024 - 14:56:16 EST


On Fri, Jun 07, 2024 at 11:39:14PM +0000,
"Edgecombe, Rick P" <rick.p.edgecombe@xxxxxxxxx> wrote:

> On Fri, 2024-06-07 at 11:31 +0200, Paolo Bonzini wrote:
> > > -int kvm_tdp_mmu_get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes,
> > > -                        int *root_level)
> > > +static int __kvm_tdp_mmu_get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes,
> > > +                                  enum kvm_tdp_mmu_root_types root_type)
> > >   {
> > > -       struct kvm_mmu_page *root = root_to_sp(vcpu->arch.mmu->root.hpa);
> > > +       struct kvm_mmu_page *root = tdp_mmu_get_root(vcpu, root_type);
> >
> > I think this function should take the struct kvm_mmu_page * directly.
> >
> > > +{
> > > +       *root_level = vcpu->arch.mmu->root_role.level;
> > > +
> > > +       return __kvm_tdp_mmu_get_walk(vcpu, addr, sptes, KVM_DIRECT_ROOTS);
> >
> > Here you pass root_to_sp(vcpu->arch.mmu->root.hpa);
>
> I see. It's another case of adding indirection to push the decision making
> into the helpers. We can open code things more.
>
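
For the record, here is roughly what I understand the suggestion to be (a
sketch only, reusing the names from this series; the callers resolve the
root themselves and pass the struct kvm_mmu_page * down):

static int __kvm_tdp_mmu_get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes,
				  struct kvm_mmu_page *root)
{
	/* Same walk as before, just starting from the given root. */
	...
}

int kvm_tdp_mmu_get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes,
			 int *root_level)
{
	struct kvm_mmu_page *root = root_to_sp(vcpu->arch.mmu->root.hpa);

	*root_level = vcpu->arch.mmu->root_role.level;
	return __kvm_tdp_mmu_get_walk(vcpu, addr, sptes, root);
}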
> >
> > > +int kvm_tdp_mmu_get_walk_mirror_pfn(struct kvm_vcpu *vcpu, u64 gpa,
> > > +                                    kvm_pfn_t *pfn)
> > > +{
> > > +       u64 sptes[PT64_ROOT_MAX_LEVEL + 1], spte;
> > > +       int leaf;
> > > +
> > > +       lockdep_assert_held(&vcpu->kvm->mmu_lock);
> > > +
> > > +       rcu_read_lock();
> > > +       leaf = __kvm_tdp_mmu_get_walk(vcpu, gpa, sptes, KVM_MIRROR_ROOTS);
> >
> > and likewise here.
> >
> > You might also consider using a kvm_mmu_root_info for the mirror root,
> > even though the pgd field is not used.
>
> This came up on the last version actually. The argument against it was that
> it wastes a tiny bit of memory on the unused pgd field. It does look more
> symmetrical though.
>
> >
> > Then __kvm_tdp_mmu_get_walk() can take a struct kvm_mmu_root_info * instead.
>
> Ahh, I see. Yes, that's a good reason.
>
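
With a kvm_mmu_root_info for the mirror root, it could look roughly like this
(sketch; a mirror_root field is hypothetical here, and its pgd field would
simply go unused):

static int __kvm_tdp_mmu_get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes,
				  struct kvm_mmu_root_info *root)
{
	struct kvm_mmu_page *sp = root_to_sp(root->hpa);

	/* Same walk as before, starting from sp. */
	...
}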
> >
> > kvm_tdp_mmu_get_walk_mirror_pfn() doesn't belong in this series, but
> > introducing __kvm_tdp_mmu_get_walk() can stay here.
>
> Ok, we can split it.
>
> >
> > Looking at the later patch, which uses
> > kvm_tdp_mmu_get_walk_mirror_pfn(), I think this function is a bit
> > overkill. I'll do a pre-review of the init_mem_region function,
> > especially the usage of kvm_gmem_populate:
> >
> > +    slot = kvm_vcpu_gfn_to_memslot(vcpu, gfn);
> > +    if (!kvm_slot_can_be_private(slot) || !kvm_mem_is_private(kvm, gfn)) {
> > +        ret = -EFAULT;
> > +        goto out_put_page;
> > +    }
> >
> > The slots_lock is taken, so checking kvm_slot_can_be_private is unnecessary.
> >
> > Checking kvm_mem_is_private perhaps should also be done in
> > kvm_gmem_populate() itself. I'll send a patch.
> >
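
Moving the kvm_mem_is_private() check into kvm_gmem_populate() would be
something like the following inside its per-page loop (a sketch anticipating
that patch, not the actual code):

	for (i = 0; i < npages; i++) {
		gfn_t gfn = start_gfn + i;

		/* Reject ranges that are not (still) private. */
		if (!kvm_mem_is_private(kvm, gfn)) {
			ret = -EFAULT;
			break;
		}
		...
	}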
> > +    read_lock(&kvm->mmu_lock);
> > +
> > +    ret = kvm_tdp_mmu_get_walk_mirror_pfn(vcpu, gpa, &mmu_pfn);
> > +    if (ret < 0)
> > +        goto out;
> > +    if (ret > PG_LEVEL_4K) {
> > +        ret = -EINVAL;
> > +        goto out;
> > +    }
> > +    if (mmu_pfn != pfn) {
> > +        ret = -EAGAIN;
> > +        goto out;
> > +    }
> >
> > If you require pre-faulting, you don't need to return mmu_pfn and
> > things would be seriously wrong if the two didn't match, wouldn't
> > they?
>
> Yea, I'm not sure why it would be a normal condition. Maybe Isaku can comment on
> the thinking?

Sean suggested that KVM_TDX_INIT_MEM_REGION check whether the PFN from the
TDP MMU and the PFN from guest_memfd match. As pointed out, the two PFNs
should always match while the appropriate lock is held (or something is
heavily broken). Personally I'm fine with removing the check and not
returning the pfn.

https://lore.kernel.org/kvm/Ze-TJh0BBOWm9spT@xxxxxxxxxx/

Then KVM provides a dedicated TDX ioctl(), i.e. what is/was KVM_TDX_INIT_MEM_REGION,
to do PAGE.ADD. KVM_TDX_INIT_MEM_REGION wouldn't need to map anything, it would
simply need to verify that the pfn from guest_memfd() is the same as what's in
the TDP MMU.
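
If the check goes away, the KVM_TDX_INIT_MEM_REGION side above would reduce
to something like this (sketch; kvm_tdp_mmu_get_walk_mirror() is a
hypothetical variant that no longer returns the pfn):

	read_lock(&kvm->mmu_lock);

	/* Only require that the gpa is already mapped at 4K in the mirror. */
	ret = kvm_tdp_mmu_get_walk_mirror(vcpu, gpa);
	if (ret < 0)
		goto out;
	if (ret > PG_LEVEL_4K) {
		ret = -EINVAL;
		goto out;
	}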
--
Isaku Yamahata <isaku.yamahata@xxxxxxxxx>