Re: [perf] unchecked MSR access error: WRMSR to 0x689 in intel_pmu_lbr_restore

From: Liang, Kan
Date: Mon Jul 11 2022 - 20:11:27 EST

On 2022-07-11 5:13 p.m., Vince Weaver wrote:
> On Mon, 11 Jul 2022, Liang, Kan wrote:
>
>>
>>
>> On 2022-07-08 12:13 p.m., Vince Weaver wrote:
>>> [ 7763.384369] unchecked MSR access error: WRMSR to 0x689 (tried to write 0x1fffffff8101349e) at rIP: 0xffffffff810704a4 (native_write_msr+0x4/0x20)
>>
>> 0x689 is a valid LBR register, MSR_LASTBRANCH_9_FROM_IP.
>> The issue is most likely caused by the known TSX bug described in
>> the commit 9fc9ddd61e0 ("perf/x86/intel: Fix MSR_LAST_BRANCH_FROM_x bug
>> when no TSX"). It looks like TSX support has been deactivated, but
>> the quirk from that commit isn't being applied for some reason.
>>
>>
>> To apply the quirk, perf relies on the boot CPU's flag and LBR format.
>>
>> static inline bool lbr_from_signext_quirk_needed(void)
>> {
>> 	bool tsx_support = boot_cpu_has(X86_FEATURE_HLE) ||
>> 			   boot_cpu_has(X86_FEATURE_RTM);
>>
>> 	return !tsx_support && x86_pmu.lbr_has_tsx;
>> }
>>
>> Could you please share the value of the PERF_CAPABILITIES MSR 0x00000345
>> of the machine?
>> I'd like to double check whether the LBR format is correct. 0x5 is expected.
>
> How would I do that? Just something like:
> # rdmsr 0x00000345
> 32c4
>

Yes. The low 6 bits of 0x32c4 are 0x04, so the LBR format is 4. That's
expected for HSW. (I made a mistake in the previous email: Skylake has
format 5, not HSW.) For LBR format 4, x86_pmu.lbr_has_tsx must be 1.
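
For reference, the format decode above can be sketched in a few lines of
user-space C. This is only an illustration: perf_cap_lbr_format() and
PERF_CAP_LBR_FMT_MASK are hypothetical names, not kernel symbols. The one
fact it relies on is that the LBR format lives in bits 5:0 of
IA32_PERF_CAPABILITIES (MSR 0x345).

```c
#include <stdint.h>

/* Bits 5:0 of IA32_PERF_CAPABILITIES (MSR 0x345) hold the LBR format. */
#define PERF_CAP_LBR_FMT_MASK	0x3fULL

/*
 * Hypothetical helper (not a kernel function): extract the LBR format
 * field from a raw PERF_CAPABILITIES value, e.g. the output of rdmsr.
 */
static inline unsigned int perf_cap_lbr_format(uint64_t perf_cap)
{
	return (unsigned int)(perf_cap & PERF_CAP_LBR_FMT_MASK);
}
```

Calling perf_cap_lbr_format(0x32c4) yields 4, which matches the value
reported above.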

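So the LBR format is fine, and the problem appears to be with the CPU
flag: for the quirk to be skipped while lbr_has_tsx is 1, the boot CPU
must still be reporting HLE or RTM.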
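
To make the dependency on the CPU flag concrete, here is a small
user-space model of the quirk check quoted earlier. It is a sketch under
assumptions, not the kernel code: struct quirk_inputs and quirk_needed()
are hypothetical names, and the three booleans stand in for
boot_cpu_has(X86_FEATURE_HLE), boot_cpu_has(X86_FEATURE_RTM) and
x86_pmu.lbr_has_tsx.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel state the check reads. */
struct quirk_inputs {
	bool boot_cpu_hle;	/* models boot_cpu_has(X86_FEATURE_HLE) */
	bool boot_cpu_rtm;	/* models boot_cpu_has(X86_FEATURE_RTM) */
	bool lbr_has_tsx;	/* models x86_pmu.lbr_has_tsx */
};

/* Same boolean logic as the quoted lbr_from_signext_quirk_needed(). */
static inline bool quirk_needed(const struct quirk_inputs *in)
{
	bool tsx_support = in->boot_cpu_hle || in->boot_cpu_rtm;

	return !tsx_support && in->lbr_has_tsx;
}
```

With lbr_has_tsx set, the check returns true only when both feature
flags are clear. If either flag is still set on the boot CPU, it returns
false and the quirk is skipped.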

Could you please collect the TSX information that Pawan mentioned in
the other thread?

Thanks,
Kan

> or is it more involved than that?
>
> Vince Weaver
> vincent.weaver@xxxxxxxxx