Re: [PATCH v5 08/12] KVM: arm64: PMU: Allow userspace to limit PMCR_EL0.N for the guest

From: Raghavendra Rao Ananta
Date: Wed Aug 23 2023 - 12:06:20 EST


On Tue, Aug 22, 2023 at 3:06 AM Shaoqin Huang <shahuang@xxxxxxxxxx> wrote:
>
> Hi Raghavendra,
>
> On 8/17/23 08:30, Raghavendra Rao Ananta wrote:
> > From: Reiji Watanabe <reijiw@xxxxxxxxxx>
> >
> > KVM does not yet support userspace modifying PMCR_EL0.N (With
> > the previous patch, KVM ignores what is written by upserspace).
> > Add support userspace limiting PMCR_EL0.N.
> >
> > Disallow userspace to set PMCR_EL0.N to a value that is greater
> > than the host value (KVM_SET_ONE_REG will fail), as KVM doesn't
> > support more event counters than the host HW implements.
> > Although this is an ABI change, this change only affects
> > userspace setting PMCR_EL0.N to a larger value than the host.
> > As accesses to unadvertised event counters indices is CONSTRAINED
> > UNPREDICTABLE behavior, and PMCR_EL0.N was reset to the host value
> > on every vCPU reset before this series, I can't think of any
> > use case where a user space would do that.
> >
> > Also, ignore writes to read-only bits that are cleared on vCPU reset,
> > and RES{0,1} bits (including writable bits that KVM doesn't support
> > yet), as those bits shouldn't be modified (at least with
> > the current KVM).
> >
> > Signed-off-by: Reiji Watanabe <reijiw@xxxxxxxxxx>
> > Signed-off-by: Raghavendra Rao Ananta <rananta@xxxxxxxxxx>
> > ---
> > arch/arm64/include/asm/kvm_host.h | 3 ++
> > arch/arm64/kvm/pmu-emul.c | 1 +
> > arch/arm64/kvm/sys_regs.c | 49 +++++++++++++++++++++++++++++--
> > 3 files changed, 51 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> > index 0f2dbbe8f6a7e..c15ec365283d1 100644
> > --- a/arch/arm64/include/asm/kvm_host.h
> > +++ b/arch/arm64/include/asm/kvm_host.h
> > @@ -259,6 +259,9 @@ struct kvm_arch {
> > /* PMCR_EL0.N value for the guest */
> > u8 pmcr_n;
> >
> > + /* Limit value of PMCR_EL0.N for the guest */
> > + u8 pmcr_n_limit;
> > +
> > /* Hypercall features firmware registers' descriptor */
> > struct kvm_smccc_features smccc_feat;
> > struct maple_tree smccc_filter;
> > diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
> > index ce7de6bbdc967..39ad56a71ad20 100644
> > --- a/arch/arm64/kvm/pmu-emul.c
> > +++ b/arch/arm64/kvm/pmu-emul.c
> > @@ -896,6 +896,7 @@ int kvm_arm_set_vm_pmu(struct kvm *kvm, struct arm_pmu *arm_pmu)
> > * while the latter does not.
> > */
> > kvm->arch.pmcr_n = arm_pmu->num_events - 1;
> > + kvm->arch.pmcr_n_limit = arm_pmu->num_events - 1;
> >
> > return 0;
> > }
> > diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
> > index 2075901356c5b..c01d62afa7db4 100644
> > --- a/arch/arm64/kvm/sys_regs.c
> > +++ b/arch/arm64/kvm/sys_regs.c
> > @@ -1086,6 +1086,51 @@ static int get_pmcr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r,
> > return 0;
> > }
> >
> > +static int set_pmcr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r,
> > + u64 val)
> > +{
> > + struct kvm *kvm = vcpu->kvm;
> > + u64 new_n, mutable_mask;
> > + int ret = 0;
> > +
> > + new_n = FIELD_GET(ARMV8_PMU_PMCR_N, val);
> > +
> > + mutex_lock(&kvm->arch.config_lock);
> > + if (unlikely(new_n != kvm->arch.pmcr_n)) {
> > + /*
> > + * The vCPU can't have more counters than the PMU
> > + * hardware implements.
> > + */
> > + if (new_n <= kvm->arch.pmcr_n_limit)
> > + kvm->arch.pmcr_n = new_n;
> > + else
> > + ret = -EINVAL;
> > + }
> > + mutex_unlock(&kvm->arch.config_lock);
>
> Another thing I am just wonder is that should we block any modification
> to the pmcr_n after vm start to run? Like add one more checking
> kvm_vm_has_ran_once() at the beginning of the set_pmcr() function.
>
Thanks for bringing it up. Reiji and I discussed about this. Checking
for kvm_vm_has_ran_once() will be a good move, however, it will go
against the ABI expectations of setting the PMCR. I'd like others to
weigh in on this as well. What do you think?

Thank you.
Raghavendra
> Thanks,
> Shaoqin
>
> > + if (ret)
> > + return ret;
> > +
> > + /*
> > + * Ignore writes to RES0 bits, read only bits that are cleared on
> > + * vCPU reset, and writable bits that KVM doesn't support yet.
> > + * (i.e. only PMCR.N and bits [7:0] are mutable from userspace)
> > + * The LP bit is RES0 when FEAT_PMUv3p5 is not supported on the vCPU.
> > + * But, we leave the bit as it is here, as the vCPU's PMUver might
> > + * be changed later (NOTE: the bit will be cleared on first vCPU run
> > + * if necessary).
> > + */
> > + mutable_mask = (ARMV8_PMU_PMCR_MASK | ARMV8_PMU_PMCR_N);
> > + val &= mutable_mask;
> > + val |= (__vcpu_sys_reg(vcpu, r->reg) & ~mutable_mask);
> > +
> > + /* The LC bit is RES1 when AArch32 is not supported */
> > + if (!kvm_supports_32bit_el0())
> > + val |= ARMV8_PMU_PMCR_LC;
> > +
> > + __vcpu_sys_reg(vcpu, r->reg) = val;
> > + return 0;
> > +}
> > +
> > /* Silly macro to expand the DBG{BCR,BVR,WVR,WCR}n_EL1 registers in one go */
> > #define DBG_BCR_BVR_WCR_WVR_EL1(n) \
> > { SYS_DESC(SYS_DBGBVRn_EL1(n)), \
> > @@ -2147,8 +2192,8 @@ static const struct sys_reg_desc sys_reg_descs[] = {
> > { SYS_DESC(SYS_CTR_EL0), access_ctr },
> > { SYS_DESC(SYS_SVCR), undef_access },
> >
> > - { PMU_SYS_REG(PMCR_EL0), .access = access_pmcr,
> > - .reset = reset_pmcr, .reg = PMCR_EL0, .get_user = get_pmcr },
> > + { PMU_SYS_REG(PMCR_EL0), .access = access_pmcr, .reset = reset_pmcr,
> > + .reg = PMCR_EL0, .get_user = get_pmcr, .set_user = set_pmcr },
> > { PMU_SYS_REG(PMCNTENSET_EL0),
> > .access = access_pmcnten, .reg = PMCNTENSET_EL0 },
> > { PMU_SYS_REG(PMCNTENCLR_EL0),
>
> --
> Shaoqin
>