Re: [PATCH 1/4] KVM: Always flush async #PF workqueue when vCPU is being destroyed

From: Sean Christopherson
Date: Mon Feb 19 2024 - 10:51:37 EST


On Mon, Feb 19, 2024, Xu Yilun wrote:
> > void kvm_clear_async_pf_completion_queue(struct kvm_vcpu *vcpu)
> > @@ -114,7 +132,6 @@ void kvm_clear_async_pf_completion_queue(struct kvm_vcpu *vcpu)
> > #else
> > if (cancel_work_sync(&work->work)) {
> > mmput(work->mm);
> > - kvm_put_kvm(vcpu->kvm); /* == work->vcpu->kvm */
> > kmem_cache_free(async_pf_cache, work);
> > }
> > #endif
> > @@ -126,7 +143,18 @@ void kvm_clear_async_pf_completion_queue(struct kvm_vcpu *vcpu)
> > list_first_entry(&vcpu->async_pf.done,
> > typeof(*work), link);
> > list_del(&work->link);
> > - kmem_cache_free(async_pf_cache, work);
> > +
> > + spin_unlock(&vcpu->async_pf.lock);
> > +
> > + /*
> > + * The async #PF is "done", but KVM must wait for the work item
> > + * itself, i.e. async_pf_execute(), to run to completion. If
> > + * KVM is a module, KVM must ensure *no* code owned by the KVM
> > + * (the module) can be run after the last call to module_put(),
> > + * i.e. after the last reference to the last vCPU's file is put.
> > + */
> > + kvm_flush_and_free_async_pf_work(work);
>
> I have a new concern when I re-visit this patchset.
>
> From kvm_check_async_pf_completion(), I see async_pf.queue is always a
> superset of async_pf.done (except wake-all work, which is not a concern
> here). And done work is skipped by the sync (cancel_work_sync()) via:
>
> if (!work->vcpu)
> continue;
>
> But now with this patch we also sync done work, so how about we just sync
> all queued work instead?

Hmm, IIUC, I think we can simply revert commit 22583f0d9c85 ("KVM: async_pf: avoid
recursive flushing of work items").
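
To illustrate the point being discussed, here is a rough userspace sketch,
not kernel code. It assumes only what is stated above: every "done" item is
also on the queue, and the existing loop skips items whose vcpu pointer has
been cleared. The struct and function names (toy_work, sync_queue_skipping_done(),
sync_queue_all(), unsynced_done()) are made up for the example, and "synced"
merely stands in for having waited on the work item.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a queued async #PF work item; all fields are hypothetical. */
struct toy_work {
	bool on_done;	/* item is also linked on the "done" list */
	bool has_vcpu;	/* models work->vcpu != NULL */
	bool synced;	/* set once the toy "sync" has visited the item */
};

/* Models the current loop: items with a cleared vcpu pointer are skipped. */
size_t sync_queue_skipping_done(struct toy_work *queue, size_t n)
{
	size_t i, synced = 0;

	assert(queue != NULL);
	for (i = 0; i < n; i++) {
		if (!queue[i].has_vcpu)
			continue;
		if (!queue[i].synced) {
			queue[i].synced = true;
			synced++;
		}
	}
	return synced;
}

/* Models the suggestion: sync every queued item, with no skip. */
size_t sync_queue_all(struct toy_work *queue, size_t n)
{
	size_t i, synced = 0;

	assert(queue != NULL);
	for (i = 0; i < n; i++) {
		if (!queue[i].synced) {
			queue[i].synced = true;
			synced++;
		}
	}
	return synced;
}

/* Counts "done" items that have not been synced yet. */
size_t unsynced_done(const struct toy_work *queue, size_t n)
{
	size_t i, count = 0;

	assert(queue != NULL);
	for (i = 0; i < n; i++) {
		if (queue[i].on_done && !queue[i].synced)
			count++;
	}
	return count;
}
```

In this model, the skipping variant leaves every done item unsynced, which is
why a second pass over the done list is needed, whereas walking the whole
queue without the skip covers the done items in one pass.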