Re: [PATCH v3 5/9] KVM: vmx/pmu: Add MSR_ARCH_LBR_DEPTH emulation for Arch LBR

From: Xu, Like
Date: Thu Mar 04 2021 - 21:33:36 EST

Next message: Sami Tolvanen: "Re: [PATCH] KVM: arm64: Disable LTO in hyp"
Previous message: Jiaxun Yang: "Re: [PATCH 2/2] MIPS: Loongson64: Move loongson_system_configuration to loongson.h"
In reply to: Sean Christopherson: "Re: [PATCH v3 5/9] KVM: vmx/pmu: Add MSR_ARCH_LBR_DEPTH emulation for Arch LBR"
Next in thread: Like Xu: "[PATCH v3 9/9] KVM: x86: Add XSAVE Support for Architectural LBRs"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On 2021/3/5 0:12, Sean Christopherson wrote:

On Thu, Mar 04, 2021, Xu, Like wrote:

Hi Sean,

Thanks for your detailed review on the patch set.

On 2021/3/4 0:58, Sean Christopherson wrote:

On Wed, Mar 03, 2021, Like Xu wrote:

@@ -348,10 +352,26 @@ static bool intel_pmu_handle_lbr_msrs_access(struct kvm_vcpu *vcpu,
return true;
}
+/*
+ * Check if the requested depth values is supported
+ * based on the bits [0:7] of the guest cpuid.1c.eax.
+ */
+static bool arch_lbr_depth_is_valid(struct kvm_vcpu *vcpu, u64 depth)
+{
+ struct kvm_cpuid_entry2 *best;
+
+ best = kvm_find_cpuid_entry(vcpu, 0x1c, 0);
+ if (best && depth && !(depth % 8))

This is still wrong, it fails to weed out depth > 64.

How come ? The testcases depth = {65, 127, 128} get #GP as expected.

@depth is a u64, throw in a number that is a multiple of 8 and >= 520, and the
"(1ULL << (depth / 8 - 1))" will trigger undefined behavior due to shifting
beyond the capacity of a ULL / u64.

Extra:

when we say "undefined behavior" if shifting beyond the capacity of a ULL,
do you mean that the actual behavior depends on the machine, architecture or compiler?

Adding the "< 64" check would also allow dropping the " & 0xff" since the check
would ensure the shift doesn't go beyond bit 7. I'm not sure the cleverness is
worth shaving a cycle, though.

Finally how about:

    if (best && depth && (depth < 65) && !(depth & 7))
        return best->eax & BIT_ULL(depth / 8 - 1);

    return false;

Do you see the room for optimization ？

Not that this is a hot path, but it's probably worth double checking that the
compiler generates simple code for "depth % 8", e.g. it can be "depth & 7)".

Emm, the "%" operation is quite normal over kernel code.

So is "&" :-) I was just pointing out that the compiler should optimize this,
and it did.

if (best && depth && !(depth % 8))
   10659:       48 85 c0                test   rax,rax
   1065c:       74 c7                   je     10625 <intel_pmu_set_msr+0x65>
   1065e:       4d 85 e4                test   r12,r12
   10661:       74 c2                   je     10625 <intel_pmu_set_msr+0x65>
   10663:       41 f6 c4 07             test   r12b,0x7
   10667:       75 bc                   jne    10625 <intel_pmu_set_msr+0x65>

It looks like the compiler does the right thing.
Do you see the room for optimization ？

+ return (best->eax & 0xff) & (1ULL << (depth / 8 - 1));

Actually, looking at this again, I would explicitly use BIT() instead of 1ULL
(or BIT_ULL), since the shift must be 7 or less.

+
+ return false;
+}
+

Next message: Sami Tolvanen: "Re: [PATCH] KVM: arm64: Disable LTO in hyp"
Previous message: Jiaxun Yang: "Re: [PATCH 2/2] MIPS: Loongson64: Move loongson_system_configuration to loongson.h"
In reply to: Sean Christopherson: "Re: [PATCH v3 5/9] KVM: vmx/pmu: Add MSR_ARCH_LBR_DEPTH emulation for Arch LBR"
Next in thread: Like Xu: "[PATCH v3 9/9] KVM: x86: Add XSAVE Support for Architectural LBRs"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]