Re: [PATCH v2] KVM: arm64: Constrain the host to the maximum shared SVE VL with pKVM

From: Fuad Tabba
Date: Thu Sep 12 2024 - 10:22:32 EST


Hi Mark,

On Thu, 12 Sept 2024 at 12:39, Mark Brown <broonie@xxxxxxxxxx> wrote:
>
> When pKVM saves and restores the host floating point state on a SVE system
> it programs the vector length in ZCR_EL2.LEN to be whatever the maximum VL
> for the PE is but uses a buffer allocated with kvm_host_sve_max_vl, the
> maximum VL shared by all PEs in the system. This means that if we run on a
> system where the maximum VLs are not consistent we will overflow the buffer
> on PEs which support larger VLs.
>
> Since the host will not currently attempt to make use of non-shared VLs fix
> this by explicitly setting the EL2 VL to be the maximum shared VL when we
> save and restore. This will enforce the limit on host VL usage. Should we
> wish to support asymmetric VLs this code will need to be updated along with
> the required changes for the host, patches have previously been posted:
>
> https://lore.kernel.org/r/20240730-kvm-arm64-fix-pkvm-sve-vl-v6-0-cae8a2e0bd66@xxxxxxxxxx

Thanks for fixing this.

One part that you haven't changed is setting ZCR_EL2 during el2 setup:
arch/arm64/include/asm/el2_setup.h: .Linit_sve_ : lines 290/291

I guess at that point it's not straightforward to figure sve_max_vl.
Is there a window after el2 setup where we might actually get the VL
implied by ZCR_ELx_LEN_MASK, or would it always get set to
sve_vq_from_vl(kvm_host_sve_max_vl) - 1 ?

That said, this passes the sve-stress test now: tested on qemu using
nvhe, pKVM non-protected guest, pKVM protected guest (the latest with
patches not upstream yet).

Assuming that the current code in el2_setup.h is fine as it is:
Tested-by: Fuad Tabba <tabba@xxxxxxxxxx>
Reviewed-by: Fuad Tabba <tabba@xxxxxxxxxx>

Cheers,
/fuad



>
> Fixes: b5b9955617bc ("KVM: arm64: Eagerly restore host fpsimd/sve state in pKVM")
> Signed-off-by: Mark Brown <broonie@xxxxxxxxxx>
> ---
> Changes in v2:
> - Update all places where we constrain the host VL, not just those where
> we save and restore host state.
> - The value written to the register is 0 based, not 1 based.
> - Link to v1: https://lore.kernel.org/r/20240910-kvm-arm64-limit-guest-vl-v1-1-54df40b95ffb@xxxxxxxxxx
> ---
> arch/arm64/kvm/hyp/include/hyp/switch.h | 2 +-
> arch/arm64/kvm/hyp/nvhe/hyp-main.c | 12 +++++++-----
> 2 files changed, 8 insertions(+), 6 deletions(-)
>
> diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
> index f59ccfe11ab9..c2cfb4d6dc92 100644
> --- a/arch/arm64/kvm/hyp/include/hyp/switch.h
> +++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
> @@ -339,7 +339,7 @@ static inline void __hyp_sve_save_host(void)
> struct cpu_sve_state *sve_state = *host_data_ptr(sve_state);
>
> sve_state->zcr_el1 = read_sysreg_el1(SYS_ZCR);
> - write_sysreg_s(ZCR_ELx_LEN_MASK, SYS_ZCR_EL2);
> + write_sysreg_s(sve_vq_from_vl(kvm_host_sve_max_vl) - 1, SYS_ZCR_EL2);
> __sve_save_state(sve_state->sve_regs + sve_ffr_offset(kvm_host_sve_max_vl),
> &sve_state->fpsr,
> true);
> diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
> index f43d845f3c4e..dd1c6aa907a2 100644
> --- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
> +++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
> @@ -33,7 +33,7 @@ static void __hyp_sve_save_guest(struct kvm_vcpu *vcpu)
> */
> sve_cond_update_zcr_vq(vcpu_sve_max_vq(vcpu) - 1, SYS_ZCR_EL2);
> __sve_save_state(vcpu_sve_pffr(vcpu), &vcpu->arch.ctxt.fp_regs.fpsr, true);
> - write_sysreg_s(ZCR_ELx_LEN_MASK, SYS_ZCR_EL2);
> + write_sysreg_s(sve_vq_from_vl(kvm_host_sve_max_vl) - 1, SYS_ZCR_EL2);
> }
>
> static void __hyp_sve_restore_host(void)
> @@ -45,10 +45,11 @@ static void __hyp_sve_restore_host(void)
> * the host. The layout of the data when saving the sve state depends
> * on the VL, so use a consistent (i.e., the maximum) host VL.
> *
> - * Setting ZCR_EL2 to ZCR_ELx_LEN_MASK sets the effective length
> - * supported by the system (or limited at EL3).
> + * Note that this constrains the PE to the maximum shared VL
> + * that was discovered, if we wish to use larger VLs this will
> + * need to be revisited.
> */
> - write_sysreg_s(ZCR_ELx_LEN_MASK, SYS_ZCR_EL2);
> + write_sysreg_s(sve_vq_from_vl(kvm_host_sve_max_vl) - 1, SYS_ZCR_EL2);
> __sve_restore_state(sve_state->sve_regs + sve_ffr_offset(kvm_host_sve_max_vl),
> &sve_state->fpsr,
> true);
> @@ -479,7 +480,8 @@ void handle_trap(struct kvm_cpu_context *host_ctxt)
> case ESR_ELx_EC_SVE:
> cpacr_clear_set(0, CPACR_ELx_ZEN);
> isb();
> - sve_cond_update_zcr_vq(ZCR_ELx_LEN_MASK, SYS_ZCR_EL2);
> + sve_cond_update_zcr_vq(sve_vq_from_vl(kvm_host_sve_max_vl) - 1,
> + SYS_ZCR_EL2);
> break;
> case ESR_ELx_EC_IABT_LOW:
> case ESR_ELx_EC_DABT_LOW:
>
> ---
> base-commit: 7c626ce4bae1ac14f60076d00eafe71af30450ba
> change-id: 20240910-kvm-arm64-limit-guest-vl-d5fba0c7cc7b
>
> Best regards,
> --
> Mark Brown <broonie@xxxxxxxxxx>
>