Re: [PATCH v4] KVM: riscv: Skip CSR restore if VCPU is reloaded on the same core

From: Andrew Jones

Date: Tue Feb 24 2026 - 16:51:09 EST


On Sun, Feb 22, 2026 at 12:57:41PM +0800, Jinyu Tang wrote:
> Currently, kvm_arch_vcpu_load() unconditionally restores guest CSRs and
> HGATP. However, when a preempted VCPU is scheduled back on the same
> physical CPU, and no other KVM VCPU has run on this CPU in the meantime,
> the hardware CSRs are still valid.
>
> Optimize the vcpu_load path by skipping the expensive CSR
> writes if all the following conditions are met:
> 1. The VCPU was previously preempted (vcpu->scheduled_out == 1).
> 2. It is being reloaded on the same CPU (vcpu->arch.last_exit_cpu == cpu).
> 3. No other VCPU used this CPU (vcpu == __this_cpu_read(kvm_former_vcpu)).
> 4. The CSRs are not dirty (!vcpu->arch.csr_dirty).
>
> To ensure this fast-path doesn't break corner cases:
> - Live migration and VCPU reset are naturally safe. KVM initializes
> last_exit_cpu to -1, which guarantees the fast-path won't trigger.
> - A new 'csr_dirty' flag is introduced to track runtime userspace
> interventions. If userspace modifies guest configurations (e.g.,
> hedeleg via KVM_SET_GUEST_DEBUG, or CSRs via KVM_SET_ONE_REG) while
> the VCPU is preempted, the flag is set so the fast path is skipped.

I'm a bit concerned that the need for csr_dirty makes this fragile. Maybe
some reorganizing/factoring of kvm_arch_vcpu_load() could be done in a way
that makes it obvious to anybody adding another CSR to be loaded that they
also need to ensure csr_dirty is set when appropriate? At least a big
comment would help.

That said, can we now apply csr_dirty everywhere CSRs are dirtied, allowing
us to remove the vcpu->scheduled_out check? Unless I'm missing something,
we should only need to ensure last-cpu and last-vcpu are as expected and
that no CSRs are dirty in order to skip the writes - it shouldn't matter
whether we're reloading after a sched-out or otherwise.

A couple nits below.

>
> Note that kvm_riscv_vcpu_aia_load() is kept outside the skip logic
> to ensure IMSIC/AIA interrupt states are always properly
> synchronized.
>
> Signed-off-by: Jinyu Tang <tjytimi@xxxxxxx>
> ---
> v3 -> v4:
> - Addressed Anup Patel's review regarding hardware state inconsistency.
> - Introduced 'csr_dirty' flag to track dynamic userspace CSR/CONFIG
> modifications (KVM_SET_ONE_REG, KVM_SET_GUEST_DEBUG), forcing a full
> restore when debugging or modifying state from userspace.
> - Kept kvm_riscv_vcpu_aia_load() out of the skip block to resolve IMSIC
> VS-file instability.
>
> v2 -> v3:
> v2 was missing a critical check because I generated the patch from the
> wrong (experimental) branch. This is fixed in v3. Sorry for the trouble.
>
> v1 -> v2:
> Apply the logic to the AIA CSR load as well. Thanks to Andrew Jones for the advice.
> ---
> arch/riscv/include/asm/kvm_host.h | 3 +++
> arch/riscv/kvm/vcpu.c | 13 +++++++++++++
> arch/riscv/kvm/vcpu_onereg.c | 3 +++
> 3 files changed, 19 insertions(+)
>
> diff --git a/arch/riscv/include/asm/kvm_host.h b/arch/riscv/include/asm/kvm_host.h
> index 24585304c..7ee47b83c 100644
> --- a/arch/riscv/include/asm/kvm_host.h
> +++ b/arch/riscv/include/asm/kvm_host.h
> @@ -273,6 +273,9 @@ struct kvm_vcpu_arch {
> /* 'static' configurations which are set only once */
> struct kvm_vcpu_config cfg;
>
> + /* Indicates modified guest CSRs */
> + bool csr_dirty;
> +
> /* SBI steal-time accounting */
> struct {
> gpa_t shmem;
> diff --git a/arch/riscv/kvm/vcpu.c b/arch/riscv/kvm/vcpu.c
> index a55a95da5..f7f58f02c 100644
> --- a/arch/riscv/kvm/vcpu.c
> +++ b/arch/riscv/kvm/vcpu.c
> @@ -24,6 +24,8 @@
> #define CREATE_TRACE_POINTS
> #include "trace.h"
>
> +static DEFINE_PER_CPU(struct kvm_vcpu *, kvm_former_vcpu);
> +
> const struct _kvm_stats_desc kvm_vcpu_stats_desc[] = {
> KVM_GENERIC_VCPU_STATS(),
> STATS_DESC_COUNTER(VCPU, ecall_exit_stat),
> @@ -537,6 +539,9 @@ int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
> vcpu->arch.cfg.hedeleg |= BIT(EXC_BREAKPOINT);
> }
>
> + /* Mark CSRs dirty on hedeleg update */

Unnecessary comment.

> + vcpu->arch.csr_dirty = true;
> +
> return 0;
> }
>
> @@ -581,6 +586,11 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
> struct kvm_vcpu_csr *csr = &vcpu->arch.guest_csr;
> struct kvm_vcpu_config *cfg = &vcpu->arch.cfg;
>
> + if (vcpu->scheduled_out && vcpu == __this_cpu_read(kvm_former_vcpu) &&
> + vcpu->arch.last_exit_cpu == cpu && !vcpu->arch.csr_dirty)

Align the second condition line under the first:

if (vcpu->scheduled_out && vcpu == __this_cpu_read(kvm_former_vcpu) &&
vcpu->arch.last_exit_cpu == cpu && !vcpu->arch.csr_dirty)

> + goto csr_restore_done;
> +
> + vcpu->arch.csr_dirty = false;
> if (kvm_riscv_nacl_sync_csr_available()) {
> nsh = nacl_shmem();
> nacl_csr_write(nsh, CSR_VSSTATUS, csr->vsstatus);
> @@ -624,6 +634,7 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
>
> kvm_riscv_mmu_update_hgatp(vcpu);
>
> +csr_restore_done:
> kvm_riscv_vcpu_timer_restore(vcpu);
>
> kvm_riscv_vcpu_host_fp_save(&vcpu->arch.host_context);
> @@ -645,6 +656,8 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
> void *nsh;
> struct kvm_vcpu_csr *csr = &vcpu->arch.guest_csr;
>
> + __this_cpu_write(kvm_former_vcpu, vcpu);
> +
> vcpu->cpu = -1;
>
> kvm_riscv_vcpu_aia_put(vcpu);
> diff --git a/arch/riscv/kvm/vcpu_onereg.c b/arch/riscv/kvm/vcpu_onereg.c
> index e7ab6cb00..88cfcb018 100644
> --- a/arch/riscv/kvm/vcpu_onereg.c
> +++ b/arch/riscv/kvm/vcpu_onereg.c
> @@ -652,6 +652,9 @@ static int kvm_riscv_vcpu_set_reg_csr(struct kvm_vcpu *vcpu,
> if (rc)
> return rc;
>
> + /* Mark CSRs dirty after userspace update csr */

Unnecessary comment.

> + vcpu->arch.csr_dirty = true;
> +
> return 0;
> }
>
> --
> 2.43.0
>

Thanks,
drew