Re: [PATCH v2 1/6] KVM: x86/mmu: Fix wrong gfn range of tlb flushing in validate_direct_spte()

From: David Matlack
Date: Tue Sep 20 2022 - 14:44:53 EST


On Tue, Sep 20, 2022 at 11:32 AM David Matlack <dmatlack@xxxxxxxxxx> wrote:
>
> On Sun, Sep 18, 2022 at 09:11:00PM +0800, Robert Hoo wrote:
> > On Wed, 2022-08-24 at 17:29 +0800, Hou Wenlong wrote:
> > > The spte pointing to the children SP is dropped, so the
> > > whole gfn range covered by the children SP should be flushed.
> > > Although, Hyper-V may treat a 1-page flush the same if the
> > > address points to a huge page, it still would be better
> > > to use the correct size of huge page. Also introduce
> > > a helper function to do range-based flushing when a direct
> > > SP is dropped, which would help prevent future buggy use
> > > of kvm_flush_remote_tlbs_with_address() in such case.
> > >
> > > > Fixes: c3134ce240eed ("KVM: Replace old tlb flush function with new one to flush a specified range.")
> > > Suggested-by: David Matlack <dmatlack@xxxxxxxxxx>
> > > Signed-off-by: Hou Wenlong <houwenlong.hwl@xxxxxxxxxxxx>
> > > ---
> > > arch/x86/kvm/mmu/mmu.c | 10 +++++++++-
> > > 1 file changed, 9 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > > index e418ef3ecfcb..a3578abd8bbc 100644
> > > --- a/arch/x86/kvm/mmu/mmu.c
> > > +++ b/arch/x86/kvm/mmu/mmu.c
> > > @@ -260,6 +260,14 @@ void kvm_flush_remote_tlbs_with_address(struct
> > > kvm *kvm,
> > > kvm_flush_remote_tlbs_with_range(kvm, &range);
> > > }
> > >
> > > +/* Flush all memory mapped by the given direct SP. */
> > > > +static void kvm_flush_remote_tlbs_direct_sp(struct kvm *kvm, struct kvm_mmu_page *sp)
> > > +{
> > > + WARN_ON_ONCE(!sp->role.direct);
> >
> > > What if !sp->role.direct? In that case the flush of sp->gfn below isn't
> > > expected, yet it is still performed. Is this operation harmless?
>
> Flushing TLBs is always harmless because KVM cannot ever assume an entry is
> in the TLB. However, *not* (properly) flushing TLBs can be harmful. If KVM ever
> calls kvm_flush_remote_tlbs_direct_sp() with an indirect SP, that is a bug in
> KVM. The TLB flush here won't be harmful, as I explained, but KVM will miss the
> TLB flush that was actually needed.
>
> That being said, I don't think any changes here are necessary.
> kvm_flush_remote_tlbs_direct_sp() only has one caller, validate_direct_spte(),
> which only operates on direct SPs. The name of the function also makes it
> obvious this should only be called with a direct SP. And if we ever mess this
> up in the future, we'll see the WARN_ON().

On reflection, we might as well replace the WARN_ON_ONCE() with
KVM_BUG_ON(). It still does a WARN_ON_ONCE(), but has the added
benefit of marking the VM as bugged so that it gets terminated.
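
To illustrate the difference, here is a rough userspace model of the two
behaviors. It is only a sketch: struct vm_model, struct sp_model,
warn_once(), bug_on() and the two flush_direct_sp_*() functions are
made-up stand-ins, not the kernel's definitions, and the TLB flush itself
is stubbed out as a counter. The real KVM_BUG_ON() also kicks vCPUs and
makes later ioctls fail, which this model does not attempt to show.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for struct kvm. */
struct vm_model {
	bool vm_bugged;	/* models the "VM is dead" flag */
	int flushes;	/* counts stubbed TLB flushes */
};

/* Hypothetical stand-in for struct kvm_mmu_page. */
struct sp_model {
	bool direct;	/* models sp->role.direct */
};

static int warnings_printed;

/* Warn the first time cond is true for this call site; return cond. */
static bool warn_once(bool cond, bool *warned, const char *what)
{
	if (cond && !*warned) {
		*warned = true;
		warnings_printed++;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond;
}

/* Same one-time warning, but additionally mark the VM as bugged. */
static bool bug_on(struct vm_model *vm, bool cond, bool *warned,
		   const char *what)
{
	if (warn_once(cond, warned, what))
		vm->vm_bugged = true;
	return cond;
}

/* WARN_ON_ONCE()-style check: complain once, the VM keeps running. */
static void flush_direct_sp_warn(struct vm_model *vm,
				 const struct sp_model *sp)
{
	static bool warned;

	warn_once(!sp->direct, &warned, "indirect SP passed to direct flush");
	vm->flushes++;	/* the flush still happens; it is always harmless */
}

/* KVM_BUG_ON()-style check: complain once and take the VM down. */
static void flush_direct_sp_bug(struct vm_model *vm,
				const struct sp_model *sp)
{
	static bool warned;

	bug_on(vm, !sp->direct, &warned, "indirect SP passed to direct flush");
	vm->flushes++;	/* the flush still happens; it is always harmless */
}
```

With an indirect SP, both variants warn once and still flush, but only the
bug_on() variant leaves vm_bugged set, so the guest would not keep running
with a possibly stale TLB entry.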