Re: [PATCH 5/7] KVM: x86/mmu: Introduce kvm_tdp_map_page() to populate guest memory
From: Paolo Bonzini
Date: Wed Apr 17 2024 - 17:32:24 EST
On Wed, Apr 17, 2024 at 11:24 PM Sean Christopherson <seanjc@google.com> wrote:
>
> On Wed, Apr 17, 2024, Paolo Bonzini wrote:
> > From: Isaku Yamahata <isaku.yamahata@xxxxxxxxx>
> >
> > Introduce a helper function to call the KVM fault handler. It allows a new
> > ioctl to invoke the KVM fault handler to populate guest memory without
> > exposing RET_PF_* enums or other definitions that are internal to the x86
> > KVM MMU. The implementation is restricted to two-dimensional paging for
> > simplicity; shadow paging faults on GVAs rather than L1 GPAs, which would
> > make the API awkward to use.
> >
> > Signed-off-by: Isaku Yamahata <isaku.yamahata@xxxxxxxxx>
> > Message-ID: <9b866a0ae7147f96571c439e75429a03dcb659b6.1712785629.git.isaku.yamahata@xxxxxxxxx>
> > Signed-off-by: Paolo Bonzini <pbonzini@xxxxxxxxxx>
> > ---
> > arch/x86/kvm/mmu.h | 3 +++
> > arch/x86/kvm/mmu/mmu.c | 32 ++++++++++++++++++++++++++++++++
> > 2 files changed, 35 insertions(+)
> >
> > diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
> > index e8b620a85627..51ff4f67e115 100644
> > --- a/arch/x86/kvm/mmu.h
> > +++ b/arch/x86/kvm/mmu.h
> > @@ -183,6 +183,9 @@ static inline void kvm_mmu_refresh_passthrough_bits(struct kvm_vcpu *vcpu,
> > __kvm_mmu_refresh_passthrough_bits(vcpu, mmu);
> > }
> >
> > +int kvm_tdp_map_page(struct kvm_vcpu *vcpu, gpa_t gpa, u64 error_code,
> > + u8 *level);
> > +
> > /*
> > * Check if a given access (described through the I/D, W/R and U/S bits of a
> > * page fault error code pfec) causes a permission fault with the given PTE
> > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > index 7fbcfc97edcc..fb2149d16f8d 100644
> > --- a/arch/x86/kvm/mmu/mmu.c
> > +++ b/arch/x86/kvm/mmu/mmu.c
> > @@ -4646,6 +4646,38 @@ int kvm_tdp_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
> > return direct_page_fault(vcpu, fault);
> > }
> >
> > +int kvm_tdp_map_page(struct kvm_vcpu *vcpu, gpa_t gpa, u64 error_code,
> > + u8 *level)
>
> If the return is an overloaded "long", then there's no need for @level, i.e. do
> the level=>size conversion in this helper.
>
> > +{
> > + int r;
> > +
> > + /* Restrict to TDP page fault. */
>
> Do we want to restrict this to the TDP MMU? Not for any particular reason, mostly
> just to keep moving towards officially deprecating/removing TDP support from the
> shadow MMU.
Heh, yet another thing I briefly thought about while reviewing Isaku's
work. In the end I decided that, with the implementation being just a
regular prefault, there's not much to be gained from keeping the shadow MMU
away from this.
The real ugly part is that if the memslots are zapped, the
pre-population effect basically goes away (damn
kvm_arch_flush_shadow_memslot). This is the reason why I initially
thought of KVM_CHECK_EXTENSION for the VM file descriptor, to only
allow this for TDX VMs.
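
For reference, KVM_CHECK_EXTENSION can already be issued on a VM file
descriptor (when /dev/kvm advertises KVM_CAP_CHECK_EXTENSION_VM), so the
answer can differ per VM type; a userspace probe would look roughly like
the sketch below, with the capability number being whatever the series
ends up defining.

#include <sys/ioctl.h>
#include <linux/kvm.h>

/*
 * Sketch only: ask the VM fd (not /dev/kvm) whether it supports @cap,
 * so the result can depend on the VM type, e.g. be limited to TDX VMs.
 * Returns > 0 if supported, 0 if not, -1 with errno on ioctl failure.
 */
static int vm_check_extension(int vm_fd, int cap)
{
	return ioctl(vm_fd, KVM_CHECK_EXTENSION, cap);
}
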
The real solution for this is to not "graduate" this ioctl too soon to
kvm/next. Let's keep it in kvm-coco-queue until TDX is ready and then
make a final decision.
Paolo
> > + if (vcpu->arch.mmu->page_fault != kvm_tdp_page_fault)
> > + return -EOPNOTSUPP;
> > +
> > + r = __kvm_mmu_do_page_fault(vcpu, gpa, error_code, true, NULL, level);
> > + if (r < 0)
> > + return r;
> > +
> > + switch (r) {
> > + case RET_PF_RETRY:
> > + return -EAGAIN;
> > +
> > + case RET_PF_FIXED:
> > + case RET_PF_SPURIOUS:
> > + return 0;
>
> Going with the "long" idea, this becomes:
>
> end = (gpa & KVM_HPAGE_MASK(level)) + KVM_HPAGE_SIZE(level);
> return min(size, end - gpa);
>
> though I would vote for a:
>
> break;
>
> so that the happy path is nicely isolated at the end of the function.
>
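
Concretely, a sketch of that size-returning variant (only the "long"
return, the @size parameter, and the final level=>size conversion differ
from the quoted patch; the __kvm_mmu_do_page_fault() call and the RET_PF_*
handling are as posted above):

long kvm_tdp_map_page(struct kvm_vcpu *vcpu, gpa_t gpa, u64 error_code,
		      u64 size)
{
	gpa_t end;
	u8 level;
	int r;

	/* Restrict to TDP page fault, as in the quoted patch. */
	if (vcpu->arch.mmu->page_fault != kvm_tdp_page_fault)
		return -EOPNOTSUPP;

	r = __kvm_mmu_do_page_fault(vcpu, gpa, error_code, true, NULL, &level);
	if (r < 0)
		return r;

	switch (r) {
	case RET_PF_RETRY:
		return -EAGAIN;

	case RET_PF_FIXED:
	case RET_PF_SPURIOUS:
		break;

	case RET_PF_EMULATE:
		return -EINVAL;

	case RET_PF_CONTINUE:
	case RET_PF_INVALID:
	default:
		WARN_ON_ONCE(r);
		return -EIO;
	}

	/*
	 * Happy path: report how many of the requested bytes are covered by
	 * the page that was just installed (or found already present).
	 */
	end = (gpa & KVM_HPAGE_MASK(level)) + KVM_HPAGE_SIZE(level);
	return min(size, end - gpa);
}

The new ioctl could then simply advance the GPA by whatever each call
reports as mapped, without ever seeing @level (the helper name below is
illustrative, and locking/cancellation checks are omitted):

static long prepopulate_range(struct kvm_vcpu *vcpu, gpa_t gpa, u64 size,
			      u64 error_code)
{
	while (size) {
		long mapped = kvm_tdp_map_page(vcpu, gpa, error_code, size);

		if (mapped < 0)
			return mapped;
		gpa += mapped;
		size -= mapped;
	}
	return 0;
}
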
> > +
> > + case RET_PF_EMULATE:
> > + return -EINVAL;
> > +
> > + case RET_PF_CONTINUE:
> > + case RET_PF_INVALID:
> > + default:
> > + WARN_ON_ONCE(r);
> > + return -EIO;
> > + }
> > +}
> > +
> > static void nonpaging_init_context(struct kvm_mmu *context)
> > {
> > context->page_fault = nonpaging_page_fault;
> > --
> > 2.43.0
> >
> >
>