Re: [RESEND PATCH] KVM: VMX: Enable/disable PML when dirty logging gets enabled/disabled

From: Sean Christopherson
Date: Fri Feb 12 2021 - 16:19:51 EST


On Fri, Feb 12, 2021, Makarand Sonare wrote:
> >> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> >> index 777177ea9a35e..eb6639f0ee7eb 100644
> >> --- a/arch/x86/kvm/vmx/vmx.c
> >> +++ b/arch/x86/kvm/vmx/vmx.c
> >> @@ -4276,7 +4276,7 @@ static void vmx_compute_secondary_exec_control(struct vcpu_vmx *vmx)
> >> */
> >> exec_control &= ~SECONDARY_EXEC_SHADOW_VMCS;
> >>
> >> - if (!enable_pml)
> >> + if (!enable_pml || !vcpu->kvm->arch.pml_enabled)
> >> exec_control &= ~SECONDARY_EXEC_ENABLE_PML;
> >
> > The checks are unnecessary if PML is dynamically toggled, i.e. this snippet
> > can unconditionally clear PML.  When setting SECONDARY_EXEC (below snippet),
> > PML will be preserved in the current controls, which is what we want.
>
> Assuming a new VCPU can be added at a later time, after PML is already
> enabled, should we clear PML in the VMCS for the new VCPU?  If yes, what
> will be the trigger for setting PML for the new VCPU?

Ah, didn't consider that. Phooey.
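
One possible trigger (untested sketch; it reuses the request and the
pml_enabled flag from this patch, and the exact spot in vCPU creation is
illustrative) would be to queue the same update request when a vCPU is
created, so the new vCPU syncs SECONDARY_EXEC_ENABLE_PML before its first
VM-Enter:

	/* In vmx_create_vcpu(), after the vCPU's VMCS is initialized. */
	if (enable_pml && vcpu->kvm->arch.pml_enabled)
		kvm_make_request(KVM_REQ_UPDATE_VCPU_DIRTY_LOGGING_STATE, vcpu);

Alternatively, keeping the vcpu->kvm->arch.pml_enabled check in
vmx_compute_secondary_exec_control(), as this patch does, covers late vCPU
creation at the cost of the extra check.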

> >> if (cpu_has_vmx_xsaves()) {
> >> @@ -7133,7 +7133,8 @@ static void vmcs_set_secondary_exec_control(struct vcpu_vmx *vmx)
> >> SECONDARY_EXEC_SHADOW_VMCS |
> >> SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE |
> >> SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES |
> >> - SECONDARY_EXEC_DESC;
> >> + SECONDARY_EXEC_DESC |
> >> + SECONDARY_EXEC_ENABLE_PML;
> >>
> >> u32 new_ctl = vmx->secondary_exec_control;
> >> u32 cur_ctl = secondary_exec_controls_get(vmx);
> >> @@ -7509,6 +7510,19 @@ static void vmx_sched_in(struct kvm_vcpu *vcpu, int cpu)
> >> static void vmx_slot_enable_log_dirty(struct kvm *kvm,
> >> struct kvm_memory_slot *slot)
> >> {
> >> + /*
> >> + * Check all slots and enable PML if dirty logging
> >> + * is being enabled for the 1st slot
> >> + *
> >> + */
> >> + if (enable_pml &&
> >> + kvm->dirty_logging_enable_count == 1 &&
> >> + !kvm->arch.pml_enabled) {
> >> + kvm->arch.pml_enabled = true;
> >> + kvm_make_all_cpus_request(kvm,
> >> + KVM_REQ_UPDATE_VCPU_DIRTY_LOGGING_STATE);
> >> + }
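
For reference, the vCPU side of that request could be a simple toggle of the
secondary controls; this is only an untested sketch (the handler name is made
up, the pml_enabled flag is from this patch), invoked when the request is
processed before VM-Enter:

static void vmx_update_vcpu_dirty_logging(struct kvm_vcpu *vcpu)
{
	struct vcpu_vmx *vmx = to_vmx(vcpu);

	/* Toggle PML in the current controls based on the per-VM state. */
	if (vcpu->kvm->arch.pml_enabled)
		secondary_exec_controls_setbit(vmx, SECONDARY_EXEC_ENABLE_PML);
	else
		secondary_exec_controls_clearbit(vmx, SECONDARY_EXEC_ENABLE_PML);
}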

...

> >> @@ -1366,15 +1367,24 @@ int __kvm_set_memory_region(struct kvm *kvm,
> >> }
> >>
> >> /* Allocate/free page dirty bitmap as needed */
> >> - if (!(new.flags & KVM_MEM_LOG_DIRTY_PAGES))
> >> + if (!(new.flags & KVM_MEM_LOG_DIRTY_PAGES)) {
> >> new.dirty_bitmap = NULL;
> >> - else if (!new.dirty_bitmap && !kvm->dirty_ring_size) {
> >> +
> >> + if (old.flags & KVM_MEM_LOG_DIRTY_PAGES) {
> >> + WARN_ON(kvm->dirty_logging_enable_count == 0);
> >> + --kvm->dirty_logging_enable_count;
> >
> > The count will be corrupted if kvm_set_memslot() fails.
> >
> > The easiest/cleanest way to fix both this and the refcounting bug is to
> > handle the count in kvm_mmu_slot_apply_flags(). That will also allow
> > making the dirty log count x86-only, and it can then be renamed to
> > cpu_dirty_log_count to align with the
> >
> > We can always move/rename the count variable if additional motivation for
> > tracking dirty logging comes along.
>
> Thanks for pointing that out.  Will this solution take care of the scenario
> where a memslot is created/deleted with KVM_MEM_LOG_DIRTY_PAGES set?

Yes? At least, that's the plan. :-) I'll post my whole series as an RFC later
today so you and Ben can poke holes in my changes. There are some TDP MMU fixes
that I've accumulated and would like to get posted before the 5.12 merge window
opens, if only so that Paolo can make an informed decision on whether or not to
enable the TDP MMU by default.
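
For reference, a rough, untested sketch of the kvm_mmu_slot_apply_flags()
approach (the helper and the cpu_dirty_log_count field are tentative names,
the count would live in struct kvm_arch, and I'm hand-waving how the old and
new memslots get plumbed in):

static void kvm_mmu_update_cpu_dirty_logging(struct kvm *kvm, bool enable)
{
	struct kvm_arch *ka = &kvm->arch;

	/*
	 * Only the 0 <-> 1 transitions need to kick vCPUs.  Counting here,
	 * i.e. only after the memslot update has succeeded, also avoids
	 * corrupting the count when kvm_set_memslot() fails.
	 */
	if ((enable && ++ka->cpu_dirty_log_count == 1) ||
	    (!enable && --ka->cpu_dirty_log_count == 0))
		kvm_make_all_cpus_request(kvm,
					  KVM_REQ_UPDATE_VCPU_DIRTY_LOGGING_STATE);

	WARN_ON_ONCE(ka->cpu_dirty_log_count < 0);
}

	/* In kvm_mmu_slot_apply_flags(), when dirty logging is toggled. */
	if ((old->flags ^ new->flags) & KVM_MEM_LOG_DIRTY_PAGES)
		kvm_mmu_update_cpu_dirty_logging(kvm,
						 new->flags & KVM_MEM_LOG_DIRTY_PAGES);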