Re: [PATCH v6] KVM: riscv: Skip CSR restore if VCPU is reloaded on the same core
From: Radim Krčmář
Date: Thu Feb 26 2026 - 10:14:15 EST
2026-02-26T20:38:02+08:00, Jinyu Tang <tjytimi@xxxxxxx>:
> Currently, kvm_arch_vcpu_load() unconditionally restores guest CSRs and
> HGATP. However, when a VCPU is loaded back on the same physical CPU,
> and no other KVM VCPU has run on this CPU since it was last put,
> the hardware CSRs are still valid.
>
> This patch optimizes the vcpu_load path by skipping the expensive CSR
> writes if all the following conditions are met:
> 1. It is being reloaded on the same CPU (vcpu->arch.last_exit_cpu == cpu).
> 2. The CSRs are not dirty (!vcpu->arch.csr_dirty).
> 3. No other VCPU used this CPU (vcpu == __this_cpu_read(kvm_former_vcpu)).
>
> To ensure this fast-path doesn't break corner cases:
> - Live migration and VCPU reset are naturally safe. KVM initializes
>   last_exit_cpu to -1, which guarantees the fast-path won't trigger.
> - A new 'csr_dirty' flag is introduced to track runtime userspace
>   interventions. If userspace modifies guest configurations (e.g.,
>   hedeleg via KVM_SET_GUEST_DEBUG, or CSRs via KVM_SET_ONE_REG) while
>   the VCPU is preempted, the flag is set to skip fast path.
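
A rough userspace model of the three conditions and the corner cases listed above, not kernel code. All model_* names are hypothetical. Two points are assumptions, because the corresponding hunks are not quoted here: that csr_dirty is cleared once a full restore has run, and that the put side records both the exit CPU and the former VCPU.

```c
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NR_CPUS 2

struct model_vcpu {
    int last_exit_cpu;           /* -1 after creation or reset */
    bool csr_dirty;              /* set when userspace rewrites guest state */
    unsigned long full_restores; /* number of slow-path loads */
};

/* Stand-in for the per-CPU kvm_former_vcpu pointer. */
static struct model_vcpu *model_former_vcpu[MODEL_NR_CPUS];

static void model_vcpu_init(struct model_vcpu *v)
{
    v->last_exit_cpu = -1;
    v->csr_dirty = false;
    v->full_restores = 0;
}

/* Returns true when the CSR restore is skipped (fast path). */
static bool model_vcpu_load(struct model_vcpu *v, int cpu)
{
    if (v->last_exit_cpu == cpu && !v->csr_dirty &&
        v == model_former_vcpu[cpu])
        return true;

    v->full_restores++;   /* stands in for the CSR and HGATP writes */
    v->csr_dirty = false; /* assumption: cleared after a full restore */
    return false;
}

/* Assumption: put records the exit CPU and the last VCPU on this CPU. */
static void model_vcpu_put(struct model_vcpu *v, int cpu)
{
    v->last_exit_cpu = cpu;
    model_former_vcpu[cpu] = v;
}

/* Stand-in for KVM_SET_ONE_REG or KVM_SET_GUEST_DEBUG. */
static void model_userspace_write(struct model_vcpu *v)
{
    v->csr_dirty = true;
}
```

In this model the first load after init always takes the slow path because last_exit_cpu is -1, and any of the following forces a full restore on the next load: a userspace write, another VCPU running on the same CPU, or a move to a different CPU.
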
>
> Note that kvm_riscv_vcpu_aia_load() is kept outside the skip logic
> to ensure IMSIC/AIA interrupt states are always properly
> synchronized.
>
> Signed-off-by: Jinyu Tang <tjytimi@xxxxxxx>
> ---
> v3 -> v4:
> - Introduced 'csr_dirty' flag to track dynamic userspace CSR/CONFIG
>   modifications (KVM_SET_ONE_REG, KVM_SET_GUEST_DEBUG), forcing a full
>   restore when debugging or modifying states at userspace.
> - Kept kvm_riscv_vcpu_aia_load() out of the skip block to resolve IMSIC
>   VS-file instability.

Excluding AIA is disturbing, as we're writing only vsiselect, hviprio1,
and hviprio2 there. It seems to me that it should be fine to skip the
AIA CSR writes too.
Wasn't the issue that you originally didn't track csr_dirty, and the bug
just manifested through IMSICs?

> diff --git a/arch/riscv/kvm/vcpu.c b/arch/riscv/kvm/vcpu.c
> @@ -581,6 +585,20 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
>  	struct kvm_vcpu_csr *csr = &vcpu->arch.guest_csr;
>  	struct kvm_vcpu_config *cfg = &vcpu->arch.cfg;
>
> +	/*
> +	 * If VCPU is being reloaded on the same physical CPU and no
> +	 * other KVM VCPU has run on this CPU since it was last put,
> +	 * we can skip the expensive CSR and HGATP writes.
> +	 *
> +	 * Note: If a new CSR is added to this fast-path skip block,
> +	 * make sure that 'csr_dirty' is set to true in any
> +	 * ioctl (e.g., KVM_SET_ONE_REG) that modifies it.
> +	 */
> +	if (vcpu->arch.last_exit_cpu == cpu && !vcpu->arch.csr_dirty &&
> +	    vcpu == __this_cpu_read(kvm_former_vcpu))
> +		goto csr_restore_done;
I see a small optimization if we set the per-cpu variable here, instead
of doing that in kvm_arch_vcpu_put:

	if (vcpu != __this_cpu_read(kvm_former_vcpu))
		__this_cpu_write(kvm_former_vcpu, vcpu);
	else if (vcpu->arch.last_exit_cpu == cpu && !vcpu->arch.csr_dirty)
		goto csr_restore_done;

This means we never have to read the per-cpu variable twice in the
load/put sequence: a faster put at the cost of a slightly slower load.
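
A minimal userspace sketch of that ordering, with hypothetical sketch_* names and two counters added only to make the per-cpu accesses visible. It assumes put still records the exit CPU and that csr_dirty is cleared after a full restore. Neither is shown in the quoted hunk.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_NR_CPUS 2

struct sketch_vcpu {
    int last_exit_cpu; /* -1 until the VCPU has run somewhere */
    bool csr_dirty;
};

/* Stand-in for the per-CPU kvm_former_vcpu pointer. */
static struct sketch_vcpu *sketch_former[SKETCH_NR_CPUS];
static unsigned long sketch_percpu_reads, sketch_percpu_writes;

/* Returns true when the CSR restore is skipped. */
static bool sketch_load(struct sketch_vcpu *v, int cpu)
{
    sketch_percpu_reads++; /* the only per-cpu read in load/put */
    if (v != sketch_former[cpu]) {
        sketch_percpu_writes++; /* written only when the VCPU changes */
        sketch_former[cpu] = v;
    } else if (v->last_exit_cpu == cpu && !v->csr_dirty) {
        return true;
    }

    v->csr_dirty = false; /* assumption: cleared after a full restore */
    return false;
}

/* Put no longer touches the per-cpu pointer at all. */
static void sketch_put(struct sketch_vcpu *v, int cpu)
{
    v->last_exit_cpu = cpu;
}
```

When the same VCPU keeps reloading on one CPU, each load/put pair in this sketch costs a single per-cpu read and no per-cpu write.
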
Thanks.