Re: [PATCH v7 049/102] KVM: x86/tdp_mmu: Ignore unsupported mmu operation on private GFNs

From: Isaku Yamahata
Date: Tue Jul 19 2022 - 14:03:30 EST


On Tue, Jul 12, 2022 at 10:58:06AM +0800,
Yuan Yao <yuan.yao@xxxxxxxxxxxxxxx> wrote:

> On Mon, Jun 27, 2022 at 02:53:41PM -0700, isaku.yamahata@xxxxxxxxx wrote:
> > From: Isaku Yamahata <isaku.yamahata@xxxxxxxxx>
> >
> > Some KVM MMU operations (dirty page logging, page migration, page aging)
> > aren't supported for private GFNs (yet) with the first generation of TDX.
> > Silently return on unsupported TDX KVM MMU operations.
> >
> > Signed-off-by: Isaku Yamahata <isaku.yamahata@xxxxxxxxx>
> > ---
> > arch/x86/kvm/mmu/tdp_mmu.c | 74 +++++++++++++++++++++++++++++++++++---
> > arch/x86/kvm/x86.c | 3 ++
> > 2 files changed, 72 insertions(+), 5 deletions(-)
> >
> > diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
> > index 12f75e60a254..fef6246086a8 100644
> > --- a/arch/x86/kvm/mmu/tdp_mmu.c
> > +++ b/arch/x86/kvm/mmu/tdp_mmu.c
> > @@ -387,6 +387,8 @@ static void handle_changed_spte_dirty_log(struct kvm *kvm, int as_id, gfn_t gfn,
> >
> > if ((!is_writable_pte(old_spte) || pfn_changed) &&
> > is_writable_pte(new_spte)) {
> > + /* For memory slot operations, use GFN without aliasing */
> > + gfn = gfn & ~kvm_gfn_shared_mask(kvm);
>
> This should be part of enabling, please consider squashing it into patch 46.

Yes, merged into it.
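
For readers following the thread, here is a minimal user-space sketch of
the un-aliasing step in the hunk above. It is not the kernel code:
example_gfn_shared_mask(), example_gfn_unalias() and the bit position are
hypothetical, chosen only to show that memslot lookups want the GFN with
the shared bit cleared.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t gfn_t;

/*
 * Assumption for illustration only: the shared bit sits at GPA bit 51,
 * i.e. GFN bit 51 - 12 = 39 with 4 KiB pages. The real position depends
 * on the guest's configuration.
 */
#define EXAMPLE_SHARED_GFN_BIT 39

/* Hypothetical stand-in for kvm_gfn_shared_mask(): 0 for non-TD guests. */
static gfn_t example_gfn_shared_mask(bool is_td)
{
	return is_td ? ((gfn_t)1 << EXAMPLE_SHARED_GFN_BIT) : 0;
}

/* Clear the shared bit so the GFN can be used for memslot operations. */
static gfn_t example_gfn_unalias(gfn_t gfn, gfn_t shared_mask)
{
	return gfn & ~shared_mask;
}
```

With a zero mask (a non-TD guest) the helper is a no-op, which is why the
masking can be applied unconditionally.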


> > slot = __gfn_to_memslot(__kvm_memslots(kvm, as_id), gfn);
> > mark_page_dirty_in_slot(kvm, slot, gfn);
> > }
> > @@ -1398,7 +1400,8 @@ typedef bool (*tdp_handler_t)(struct kvm *kvm, struct tdp_iter *iter,
> >
> > static __always_inline bool kvm_tdp_mmu_handle_gfn(struct kvm *kvm,
> > struct kvm_gfn_range *range,
> > - tdp_handler_t handler)
> > + tdp_handler_t handler,
> > + bool only_shared)
> > {
> > struct kvm_mmu_page *root;
> > struct tdp_iter iter;
> > @@ -1409,9 +1412,23 @@ static __always_inline bool kvm_tdp_mmu_handle_gfn(struct kvm *kvm,
> > * into this helper allow blocking; it'd be dead, wasteful code.
> > */
> > for_each_tdp_mmu_root(kvm, root, range->slot->as_id) {
> > + gfn_t start;
> > + gfn_t end;
> > +
> > + if (only_shared && is_private_sp(root))
> > + continue;
> > +
> > rcu_read_lock();
> >
> > - tdp_root_for_each_leaf_pte(iter, root, range->start, range->end)
> > + /*
> > + * For TDX shared mapping, set GFN shared bit to the range,
> > + * so the handler() doesn't need to set it, to avoid duplicated
> > + * code in multiple handler()s.
> > + */
> > + start = kvm_gfn_for_root(kvm, root, range->start);
> > + end = kvm_gfn_for_root(kvm, root, range->end);
> > +
> > + tdp_root_for_each_leaf_pte(iter, root, start, end)
> > ret |= handler(kvm, &iter, range);
> >
> > rcu_read_unlock();
> > @@ -1455,7 +1472,12 @@ static bool age_gfn_range(struct kvm *kvm, struct tdp_iter *iter,
> >
> > bool kvm_tdp_mmu_age_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range)
> > {
> > - return kvm_tdp_mmu_handle_gfn(kvm, range, age_gfn_range);
> > + /*
> > + * First TDX generation doesn't support clearing A bit for private
> > + * mapping, since there's no secure EPT API to support it. However
> > + * it's a legitimate request for TDX guest.
> > + */
> > + return kvm_tdp_mmu_handle_gfn(kvm, range, age_gfn_range, true);
> > }
> >
> > static bool test_age_gfn(struct kvm *kvm, struct tdp_iter *iter,
> > @@ -1466,7 +1488,7 @@ static bool test_age_gfn(struct kvm *kvm, struct tdp_iter *iter,
> >
> > bool kvm_tdp_mmu_test_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
> > {
> > - return kvm_tdp_mmu_handle_gfn(kvm, range, test_age_gfn);
> > + return kvm_tdp_mmu_handle_gfn(kvm, range, test_age_gfn, false);
>
> The "false" here means we will do young testing even for private
> pages, but we don't have the actual A bit state in iter->old_spte for
> them, so maybe this should be "true"?

Yes, nice catch.
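
A rough user-space model of why the flag matters, assuming a simplified
root structure. struct example_root and example_handle_roots() are
hypothetical and only mimic the skip logic of kvm_tdp_mmu_handle_gfn():

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a TDP MMU root; not the kernel's structure. */
struct example_root {
	bool is_private;	/* root maps private GFNs */
	bool spte_accessed;	/* A bit as recorded in KVM's copy of the SPTE */
};

/*
 * OR together the "accessed" state of every root, skipping private roots
 * when only_shared is set, as the patched helper does.
 */
static bool example_handle_roots(const struct example_root *roots, size_t n,
				 bool only_shared)
{
	bool ret = false;
	size_t i;

	for (i = 0; i < n; i++) {
		if (only_shared && roots[i].is_private)
			continue;
		ret |= roots[i].spte_accessed;
	}
	return ret;
}
```

With only_shared == false, a private root whose recorded A bit does not
reflect the real state can still make the result true. Passing true
leaves such roots out of the answer.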


> > }
> >
> > static bool set_spte_gfn(struct kvm *kvm, struct tdp_iter *iter,
> > @@ -1511,8 +1533,11 @@ bool kvm_tdp_mmu_set_spte_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
> > * No need to handle the remote TLB flush under RCU protection, the
> > * target SPTE _must_ be a leaf SPTE, i.e. cannot result in freeing a
> > * shadow page. See the WARN on pfn_changed in __handle_changed_spte().
> > + *
> > + * .change_pte() callback should not happen for private page, because
> > + * for now TDX private pages are pinned during VM's life time.
> > */
>
> Is it worth catching this with WARN_ON()? Up to you.

The callback can still be called for shared pages. Here there is no easy way to
tell which GPA (private or shared) caused it, i.e. no easy condition for WARN_ON().

--
Isaku Yamahata <isaku.yamahata@xxxxxxxxx>