Re: [PATCH] perf/x86/intel/lbr: fix branch type encoding
From: Liang, Kan
Date: Sun Aug 14 2022 - 15:44:43 EST
On 2022-08-12 4:16 a.m., Andi Kleen wrote:
>
>>
>> I think the option is to avoid the overhead of disassembling of branch
>> instruction. See eb0baf8a0d92 ("perf/core: Define the common branch type
>> classification")
>> "Since the disassembling of branch instruction needs some overhead,
>> a new PERF_SAMPLE_BRANCH_TYPE_SAVE is introduced to indicate if it
>> needs to disassemble the branch instruction and record the branch
>> type."
>
>
> Thanks for digging it out. So it was only performance.
>
>>
>> I have no idea how big the overhead is. If we can always be benefit from
>> the branch type. I guess we can make it default on.
>
> I thought even arch LBR had one case where it had to disassemble, but
> perhaps it's unlikely enough because it's pre filtered. If yes it may be
> ok to enable it there unconditionally at the kernel level.
>
Yes, Arch LBR should have much less overhead than the previous
platforms. The most common branches, JCC and near JMP/CALL, are from the
HW. Only the other branches, e.g., far call, SYS* etc, which still rely
on the SW disassemble. The number of the other branches should not be
big. I agree that we should enable the branch type for the Arch LBR
unconditionally at the kernel level.
Peter? Stephane? What do you think?
> Still have to decide if we want older parts to have more overhead by
> default. I guess would need some data on that.
The previous LBR already has high overhead. The branch type overhead
will make it worse. I think it's better keep it default off. I think we
can make it clear in the document that the branch type is only default
on for the new platforms with Arch LBR support (12th-Gen+ client or
4th-Gen Xeon+ server).
Thanks,
Kan