Re: [RFC PATCH 09/13] KVM: nSVM: Handle nested TLB flush requests through TLB_CONTROL
From: Yosry Ahmed
Date: Wed Feb 05 2025 - 16:46:06 EST
On Wed, Feb 05, 2025 at 06:23:58PM +0000, Yosry Ahmed wrote:
> Handle L1's requests to flush L2's TLB through the TLB_CONTROL field of
> VMCB12. This is currently redundant because a full flush is executed on
> every nested transition, but is a step towards removing that.
>
> TLB_CONTROL_FLUSH_ALL_ASID flushes all ASIDs from L1's perspective,
> including its own, so do a guest TLB flush on both transitions. Never
> propagate TLB_CONTROL_FLUSH_ALL_ASID from the guest to the actual VMCB,
> as this gives the guest the power to flush the entire physical TLB
> (including translations for the host and other VMs).
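
(Not part of the diff, just for reviewers' reference: the TLB_CONTROL
encodings being handled. The values match arch/x86/include/asm/svm.h; the
comments are mine, summarizing the flush each value requests on VMRUN per
the APM.)

	#define TLB_CONTROL_DO_NOTHING		0	/* no flush */
	#define TLB_CONTROL_FLUSH_ALL_ASID	1	/* flush the entire TLB, all ASIDs */
	#define TLB_CONTROL_FLUSH_ASID		3	/* flush this guest's ASID */
	#define TLB_CONTROL_FLUSH_ASID_LOCAL	7	/* flush this guest's non-global entries */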
>
> For other cases, the TLB flush is only done when entering L2. The nested
> NPT MMU is also sync'd because TLB_CONTROL also flushes NPT
> guest-physical mappings.
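
To make L1's side of this concrete, a minimal illustration (not from this
patch; l2_asid is a made-up name for whatever ASID L1 assigned to L2):

	/* L1 asks for L2's TLB entries to be flushed on the next VMRUN. */
	vmcb12->control.asid    = l2_asid;
	vmcb12->control.tlb_ctl = TLB_CONTROL_FLUSH_ASID;

	/*
	 * With this patch, KVM services the request on nested VMRUN via
	 * KVM_REQ_TLB_FLUSH_GUEST, plus KVM_REQ_MMU_SYNC when nested NPT
	 * is enabled, since the flush also drops NPT guest-physical
	 * mappings.
	 */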
>
> All TLB_CONTROL values can be handled by KVM regardless of FLUSHBYASID
> support on the underlying CPU, so keep advertising FLUSHBYASID to the
> guest unconditionally.
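
The end state on the svm.c side is simply that the feature bit stays set for
nested guests, along these lines (a sketch of the intent using the existing
kvm_cpu_cap_set() helper, not the actual hunk):

	if (nested) {
		kvm_cpu_cap_set(X86_FEATURE_SVM);
		/*
		 * All TLB_CONTROL values are emulated with
		 * KVM_REQ_TLB_FLUSH_GUEST, which falls back to assigning a
		 * new ASID when the CPU lacks FLUSHBYASID, so the feature
		 * can be advertised to L1 unconditionally.
		 */
		kvm_cpu_cap_set(X86_FEATURE_FLUSHBYASID);
	}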
>
> Signed-off-by: Yosry Ahmed <yosry.ahmed@xxxxxxxxx>
> ---
> arch/x86/kvm/svm/nested.c | 42 ++++++++++++++++++++++++++++++++-------
> arch/x86/kvm/svm/svm.c | 6 +++---
> 2 files changed, 38 insertions(+), 10 deletions(-)
>
> diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
> index 0735177b95a1d..e2c59eb2907e8 100644
> --- a/arch/x86/kvm/svm/nested.c
> +++ b/arch/x86/kvm/svm/nested.c
> @@ -484,19 +484,36 @@ static void nested_save_pending_event_to_vmcb12(struct vcpu_svm *svm,
>
> static void nested_svm_entry_tlb_flush(struct kvm_vcpu *vcpu)
> {
> + struct vcpu_svm *svm = to_svm(vcpu);
> +
> /* Handle pending Hyper-V TLB flush requests */
> kvm_hv_nested_transtion_tlb_flush(vcpu, npt_enabled);
>
> + /*
> + * If L1 requested a TLB flush for L2, flush L2's TLB on nested entry
> + * and sync the nested NPT MMU, as TLB_CONTROL also flushes NPT
> + * guest-physical mappings. We technically only need to flush guest_mode
> + * page tables.
> + *
> + * KVM_REQ_TLB_FLUSH_GUEST will flush L2's ASID even if the underlying
> + * CPU does not support FLUSHBYASID (by assigning a new ASID), so we
> + * can handle all TLB_CONTROL values from L1 regardless.
> + *
> + * Note that TLB_CONTROL_FLUSH_ASID_LOCAL is handled exactly like
> + * TLB_CONTROL_FLUSH_ASID. We can technically flush fewer TLB entries,
> + * but this would require significantly more complexity.
> + */
> + if (svm->nested.ctl.tlb_ctl != TLB_CONTROL_DO_NOTHING) {
> + if (nested_npt_enabled(svm))
> + kvm_make_request(KVM_REQ_MMU_SYNC, vcpu);
> + kvm_make_request(KVM_REQ_TLB_FLUSH_GUEST, vcpu);
> + }
> +
> /*
> * TODO: optimize unconditional TLB flush/MMU sync. A partial list of
> * things to fix before this can be conditional:
> *
> - * - Honor L1's request to flush an ASID on nested VMRUN
> - * - Sync nested NPT MMU on VMRUN that flushes L2's ASID[*]
> * - Don't crush a pending TLB flush in vmcb02 on nested VMRUN
> - *
> - * [*] Unlike nested EPT, SVM's ASID management can invalidate nested
> - * NPT guest-physical mappings on VMRUN.
> */
> kvm_make_request(KVM_REQ_MMU_SYNC, vcpu);
> kvm_make_request(KVM_REQ_TLB_FLUSH_CURRENT, vcpu);
> @@ -504,9 +521,18 @@ static void nested_svm_entry_tlb_flush(struct kvm_vcpu *vcpu)
>
> static void nested_svm_exit_tlb_flush(struct kvm_vcpu *vcpu)
> {
> + struct vcpu_svm *svm = to_svm(vcpu);
> +
> /* Handle pending Hyper-V TLB flush requests */
> kvm_hv_nested_transtion_tlb_flush(vcpu, npt_enabled);
>
> + /*
> + * If L1 had requested a full TLB flush when entering L2, also flush its
> + * TLB entries when exiting back to L1.
> + */
> + if (svm->nested.ctl.tlb_ctl == TLB_CONTROL_FLUSH_ALL_ASID)
> + kvm_make_request(KVM_REQ_TLB_FLUSH_GUEST, vcpu);
> +
> /* See nested_svm_entry_tlb_flush() */
> kvm_make_request(KVM_REQ_MMU_SYNC, vcpu);
> kvm_make_request(KVM_REQ_TLB_FLUSH_CURRENT, vcpu);
> @@ -825,7 +851,8 @@ int enter_svm_guest_mode(struct kvm_vcpu *vcpu, u64 vmcb12_gpa,
> nested_svm_copy_common_state(svm->vmcb01.ptr, svm->nested.vmcb02.ptr);
>
> svm_switch_vmcb(svm, &svm->nested.vmcb02);
> - nested_vmcb02_prepare_control(svm, vmcb12->save.rip, vmcb12->save.cs.base);
> + nested_vmcb02_prepare_control(svm, vmcb12->save.rip,
> + vmcb12->save.cs.base);
These unrelated changes were unintentional; please ignore them.