Re: [PATCH v5 15/19] KVM: Provide common implementation for generic dirty log functions

From: Sean Christopherson
Date: Fri Feb 07 2020 - 20:29:59 EST


On Fri, Feb 07, 2020 at 07:53:34PM -0500, Peter Xu wrote:
> On Fri, Feb 07, 2020 at 04:42:33PM -0800, Sean Christopherson wrote:
> > On Fri, Feb 07, 2020 at 07:18:32PM -0500, Peter Xu wrote:
> > > On Fri, Feb 07, 2020 at 11:45:32AM -0800, Sean Christopherson wrote:
> > > > +Vitaly for HyperV
> > > >
> > > > On Thu, Feb 06, 2020 at 04:41:06PM -0500, Peter Xu wrote:
> > > > > On Thu, Feb 06, 2020 at 01:21:20PM -0800, Sean Christopherson wrote:
> > > > > > On Thu, Feb 06, 2020 at 03:02:00PM -0500, Peter Xu wrote:
> > > > > > > But that matters to this patch because if MIPS can use
> > > > > > > kvm_flush_remote_tlbs(), then we probably don't need this
> > > > > > > arch-specific hook any more and we can directly call
> > > > > > > kvm_flush_remote_tlbs() after sync dirty log when flush==true.
> > > > > >
> > > > > > Ya, the asid_flush_mask in kvm_vz_flush_shadow_all() is the only thing
> > > > > > that prevents calling kvm_flush_remote_tlbs() directly, but I have no
> > > > > > clue as to the important of that code.
> > > > >
> > > > > As said above I think the x86 lockdep is really not necessary, then
> > > > > considering MIPS could be the only one that will use the new hook
> > > > > introduced in this patch... Shall we figure that out first?
> > > >
> > > > So I prepped a follow-up patch to make kvm_arch_dirty_log_tlb_flush() a
> > > > MIPS-only hook and use kvm_flush_remote_tlbs() directly for arm and x86,
> > > > but then I realized x86 *has* a hook to do a precise remote TLB flush.
> > > > There's even an existing kvm_flush_remote_tlbs_with_address() call on a
> > > > memslot, i.e. this exact scenario. So arguably, x86 should be using the
> > > > more precise flush and should keep kvm_arch_dirty_log_tlb_flush().
> > > >
> > > > But, the hook is only used when KVM is running as an L1 on top of HyperV,
> > > > and I assume dirty logging isn't used much, if at all, for L1 KVM on
> > > > HyperV?
> > > >
> > > > I see three options:
> > > >
> > > > 1. Make kvm_arch_dirty_log_tlb_flush() MIPS-only and call
> > > > kvm_flush_remote_tlbs() directly for arm and x86. Add comments to
> > > > explain when an arch should implement kvm_arch_dirty_log_tlb_flush().
> > > >
> > > > 2. Change x86 to use kvm_flush_remote_tlbs_with_address() when flushing
> > > > a memslot after the dirty log is grabbed by userspace.
> > > >
> > > > 3. Keep the resulting code as is, but add a comment in x86's
> > > > kvm_arch_dirty_log_tlb_flush() to explain why it uses
> > > > kvm_flush_remote_tlbs() instead of the with_address() variant.
> > > >
> > > > I strongly prefer to (2) or (3), but I'll defer to Vitaly as to which of
> > > > those is preferable.
> > > >
> > > > I don't like (1) because (a) it requires more lines code (well comments),
> > > > to explain why kvm_flush_remote_tlbs() is the default, and (b) it would
> > > > require even more comments, which would be x86-specific in generic KVM,
> > > > to explain why x86 doesn't use its with_address() flush, or we'd lost that
> > > > info altogether.
> > > >
> > >
> > > I proposed the 4th solution here:
> > >
> > > https://lore.kernel.org/kvm/20200207223520.735523-1-peterx@xxxxxxxxxx/
> > >
> > > I'm not sure whether that's acceptable, but if it can, then we can
> > > drop the kvm_arch_dirty_log_tlb_flush() hook, or even move on to
> > > per-slot tlb flushing.
> >
> > This effectively is per-slot TLB flushing, it just has a different name.
> > I.e. s/kvm_arch_dirty_log_tlb_flush/kvm_arch_flush_remote_tlbs_memslot.
> > I'm not opposed to that name change. And on second and third glance, I
> > probably prefer it. That would more or less follow the naming of
> > kvm_arch_flush_shadow_all() and kvm_arch_flush_shadow_memslot().
>
> Note that the major point of the above patchset is not about doing tlb
> flush per-memslot or globally. It's more about whether we can provide
> a common entrance for TLB flushing. Say, after that series, we should
> be able to flush TLB on all archs (majorly, including MIPS) as:
>
> kvm_flush_remote_tlbs(kvm);
>
> And with the same idea we can also introduce the ranged version.
>
> >
> > I don't want to go straight to kvm_arch_flush_remote_tlb_with_address()
> > because that loses the important distinction (on x86) that slots_lock is
> > expected to be held.
>
> Sorry I'm still puzzled on why that lockdep is so important and
> special for x86... For example, what if we move that lockdep to the
> callers of the kvm_arch_dirty_log_tlb_flush() calls so it protects
> even more arch (where we do get/clear dirty log)? IMHO the callers
> must be with the slots_lock held anyways no matter for x86 or not.


Following the breadcrumbs leads to the comment in
kvm_mmu_slot_remove_write_access(), which says:

/*
* kvm_mmu_slot_remove_write_access() and kvm_vm_ioctl_get_dirty_log()
* which do tlb flush out of mmu-lock should be serialized by
* kvm->slots_lock otherwise tlb flush would be missed.
*/

I.e. write-protecting a memslot and grabbing the dirty log for the memslot
need to be serialized. It's quite obvious *now* that get_dirty_log() holds
slots_lock, but the purpose of lockdep assertions isn't just to verify the
current functionality, it's to help ensure the correctness for future code
and to document assumptions in the code.

Digging deeper, there are four functions, all related to dirty logging, in
the x86 mmu that basically open code what x86's
kvm_arch_flush_remote_tlbs_memslot() would look like if it uses the range
based flushing.

Unless it's functionally incorrect (Vitaly?), going with option (2) and
naming the hook kvm_arch_flush_remote_tlbs_memslot() seems like the obvious
choice, e.g. the final cleanup gives this diff stat:

arch/x86/kvm/mmu/mmu.c | 34 +++++++++-------------------------
1 file changed, 9 insertions(+), 25 deletions(-)