Re: [PATCH v2 3/4] Add support for AMD Core Perf Extension in guest

From: Borislav Petkov
Date: Thu Nov 09 2017 - 13:34:38 EST



> Subject: Re: [PATCH v2 3/4] Add support for AMD Core Perf Extension in guest

Btw, your subjects need a prefix:

"x86/kvm: Add guest support for the AMD core performance counters"

for example.

On Mon, Nov 06, 2017 at 11:44:25AM -0600, Janakarajan Natarajan wrote:
> This patch adds support for AMD Core Performance counters in the guest.

Never say "This patch" in the commit message of a patch. It is
tautologically useless.

> The base event select and counter MSRs are changed. In addition, with
> the core extension, there are 2 extra counters available for performance
> measurements for a total of 6.
>
> With the new MSRs, the logic to map them to the gp_counters[] is changed.
> New functions are introduced to get the right base MSRs and to check the
> validity of the get/set MSRs.
>
> If the guest has vcpus of either family 16h or a generation < 15h, it

You're only talking about families here, the "generation" thing is
confusing.

> falls back to using K7 MSRs and the number of counters the guest can
> access is set to 4.
>
> Signed-off-by: Janakarajan Natarajan <Janakarajan.Natarajan@xxxxxxx>
> ---
> arch/x86/kvm/pmu_amd.c | 133 +++++++++++++++++++++++++++++++++++++++++++------
> arch/x86/kvm/x86.c | 1 +
> 2 files changed, 120 insertions(+), 14 deletions(-)

...

> +static inline struct kvm_pmc *get_gp_pmc_amd(struct kvm_pmu *pmu, u32 msr,
> + enum pmu_type type)
> +{
> + unsigned int base = get_msr_base(pmu, type);
> +
> + if (base == MSR_F15H_PERF_CTL) {
> + switch (msr) {
> + case MSR_F15H_PERF_CTL0:
> + case MSR_F15H_PERF_CTL1:
> + case MSR_F15H_PERF_CTL2:
> + case MSR_F15H_PERF_CTL3:
> + case MSR_F15H_PERF_CTL4:
> + case MSR_F15H_PERF_CTL5:
> + /*
> + * AMD Perf Extension MSRs are not continuous.
> + *
> + * E.g. MSR_F15H_PERF_CTR0 -> 0xc0010201
> + * MSR_F15H_PERF_CTR1 -> 0xc0010203
> + *
> + * These are mapped to work with gp_counters[].
> + * The index into the array is calculated by
> + * dividing the difference between the requested
> + * msr and the msr base by 2.
> + *
> + * E.g. MSR_F15H_PERF_CTR1 uses
> + * ->gp_counters[(0xc0010203-0xc0010201)/2]
> + * ->gp_counters[1]
> + */
> + return &pmu->gp_counters[(msr - base) >> 1];

Ok, it took me a bit of staring to understand what you're doing here.
And frankly, this scheme is silly and fragile. You're relying on the
fact that you can do math with the MSR numbers to get you the GP counter
number. The moment that changes in future families, you are going to
have to devise a new scheme for the new family.

And instead of doing that, you're much better off producing a simple
MSR -> counter mapping for each family which is a simple switch-case.
No need to do get_msr_base() and whatnot - you simply feed in the MSR
number and the function spits out a gp_counters index. You only need to
check the family of the vcpu.

Then lookers will be able to understand the code at a quick glance too.

...

> @@ -153,8 +251,15 @@ static int amd_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> static void amd_pmu_refresh(struct kvm_vcpu *vcpu)
> {
> struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
> + int family, nr_counters;
> +
> + family = guest_cpuid_family(vcpu);
> + if (family == 0x15 || family == 0x17)
> + nr_counters = AMD64_NUM_COUNTERS_CORE;
> + else
> + nr_counters = AMD64_NUM_COUNTERS;
>
> - pmu->nr_arch_gp_counters = AMD64_NUM_COUNTERS;
> + pmu->nr_arch_gp_counters = nr_counters;
> pmu->counter_bitmask[KVM_PMC_GP] = ((u64)1 << 48) - 1;
> pmu->reserved_bits = 0xffffffff00200000ull;
> /* not applicable to AMD; but clean them to prevent any fall out */
> @@ -169,7 +274,7 @@ static void amd_pmu_init(struct kvm_vcpu *vcpu)
> struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
> int i;
>
> - for (i = 0; i < AMD64_NUM_COUNTERS ; i++) {
> + for (i = 0; i < AMD64_NUM_COUNTERS_CORE ; i++) {

This all works because INTEL_PMC_MAX_GENERIC is bigger than the AMD num
counters but you need to check all that.

Also, the finding out of the nr_counters you do in amd_pmu_refresh()
should happen here, in the init function so that you have
pmu->nr_arch_gp_counters properly set and then when you iterate over
counters in the remaining functions, you do:

for (i = 0; i < pmu->nr_arch_gp_counters ; i++) {

instead of using those defines which are not always correct, depending
on the family.

--
Regards/Gruss,
Boris.

SUSE Linux GmbH, GF: Felix ImendÃrffer, Jane Smithard, Graham Norton, HRB 21284 (AG NÃrnberg)
--