Re: [PATCH] KVM: x86/vPMU: Forbid writing to MSR_F15H_PERF MSRs when guest doesn't have X86_FEATURE_PERFCTR_CORE

From: Vitaly Kuznetsov
Date: Thu Mar 25 2021 - 04:11:25 EST


Haiwei Li <lihaiwei.kernel@xxxxxxxxx> writes:

> On Tue, Mar 23, 2021 at 4:48 PM Vitaly Kuznetsov <vkuznets@xxxxxxxxxx> wrote:
>>
>> MSR_F15H_PERF_CTL0-5, MSR_F15H_PERF_CTR0-5 MSRs are only available when
>> X86_FEATURE_PERFCTR_CORE CPUID bit was exposed to the guest. KVM, however,
>> allows these MSRs unconditionally because kvm_pmu_is_valid_msr() ->
>> amd_msr_idx_to_pmc() check always passes and because kvm_pmu_set_msr() ->
>> amd_pmu_set_msr() doesn't fail.
>>
>> In case of a counter (CTRn), no big harm is done as we only increase
>> internal PMC's value but in case of an eventsel (CTLn), we go deep into
>> perf internals with a non-existing counter.
>>
>> Note, kvm_get_msr_common() just returns '0' when these MSRs don't exist
>> and this also seems to contradict architectural behavior which is #GP
>> (I did check one old Opteron host) but changing this status quo is a bit
>> scarier.
>
> When msr doesn't exist, kvm_get_msr_common() returns KVM_MSR_RET_INVALID
> in `default:` and kvm_complete_insn_gp() will inject #GP to guest.
>

I'm looking at the following in kvm_get_msr_common():

switch (msr_info->index) {
...
case MSR_F15H_PERF_CTL0 ... MSR_F15H_PERF_CTR5:
...
if (kvm_pmu_is_valid_msr(vcpu, msr_info->index))
return kvm_pmu_get_msr(vcpu, msr_info);
msr_info->data = 0;
break;
...
}
return 0;

so it's kind of 'always exists' or am I wrong?

> Also i have wrote a kvm-unit-test, tested both on amd EPYC and intel
> CascadeLake. A #GP error was printed.
> Just like:
>
> Unhandled exception 13 #GP at ip 0000000000400420
> error_code=0000 rflags=00010006 cs=00000008
> rax=0000000000000000 rcx=0000000000000620 rdx=00000000006164a0
> rbx=0000000000009500
> rbp=0000000000517490 rsi=0000000000616ae0 rdi=0000000000000001
> r8=0000000000000001 r9=00000000000003f8 r10=000000000000000d
> r11=0000000000000000
> r12=0000000000000000 r13=0000000000000000 r14=0000000000000000
> r15=0000000000000000
> cr0=0000000080000011 cr2=0000000000000000 cr3=000000000040b000
> cr4=0000000000000020
> cr8=0000000000000000
> STACK: @400420 400338

Did this happen on read or write? The later is expected, the former is
not. Could you maybe drop your code here, I'd like to see what's going
on.

--
Vitaly