Re: [PATCH v2 3/3] perf report: Implement visual marker for macro fusion in annotate

From: Jin, Yao
Date: Mon Jun 19 2017 - 21:54:58 EST



Reference for macro fusion is the optimization guide,
http://www.intel.com/content/www/us/en/architecture-and-technology/64-ia-32-architectures-optimization-manual.html
section 2.3.2.1:
- In Intel microarchitecture code name Nehalem: CMP, TEST.
- In Intel microarchitecture code name Sandy Bridge: CMP, TEST, ADD, SUB,
  AND, INC, DEC.
- These instructions can fuse if:
  - The first source / destination operand is a register.
  - The second source operand (if exists) is one of: immediate, register,
    or non RIP-relative memory.
- The second instruction of the macro-fusable pair is a conditional branch.

We probably don't need the full rules, just a simple test for
CMP/TEST/ADD/SUB/AND/INC/DEC followed by a Jcc conditional branch.
Also I don't think we need to distinguish Nehalem/Sandy Bridge and other
core platforms. A simple test may be acceptable.
Hmm, then we need to make it clear somehow that this may or may not be
happening, with the above rules and the optimization guide URL and pages
mentioned in the documentation.

I think that, as we improve the disassembler, the more precise we can be
the better. If we know that the machine is x86 _and_ Nehalem, then we
should show this fusing visual cue only for CMP and TEST, etc.

- Arnaldo

I will add a check for Nehalem (CMP, TEST). For other, newer Intel CPUs,
the default will be to check the full list (CMP, TEST, ADD, SUB, AND, INC, DEC).
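
A rough sketch of what such a check could look like. This is not the actual
patch: the function and table names are hypothetical, the is_nehalem flag is
assumed to come from some CPU model lookup done elsewhere, and matching a
mnemonic plus an optional AT&T size suffix is only one possible simple test.
The operand rules from the guide are deliberately not checked.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Mnemonics that may fuse on Nehalem, per the list quoted above. */
static const char *const fuse_nehalem[] = { "cmp", "test" };

/* Sandy Bridge list, used here as the default for newer CPUs. */
static const char *const fuse_default[] = {
	"cmp", "test", "add", "sub", "and", "inc", "dec",
};

/*
 * True if ins is exactly base, or base plus one AT&T size suffix
 * (b/w/l/q). This keeps e.g. "cmpxchg" or "addps" from matching.
 */
static bool mnemonic_matches(const char *ins, const char *base)
{
	size_t n = strlen(base);

	if (strncmp(ins, base, n) != 0)
		return false;
	if (ins[n] == '\0')
		return true;
	return strchr("bwlq", ins[n]) != NULL && ins[n + 1] == '\0';
}

/*
 * Simplification: anything starting with 'j' other than jmp is treated
 * as a conditional branch (so j*cxz is not filtered out here).
 */
static bool is_cond_branch(const char *ins)
{
	return ins[0] == 'j' && strncmp(ins, "jmp", 3) != 0;
}

/* Hypothetical check: could ins1 followed by ins2 be macro-fused? */
static bool sketch_ins_is_fused(bool is_nehalem, const char *ins1,
				const char *ins2)
{
	const char *const *tbl = is_nehalem ? fuse_nehalem : fuse_default;
	size_t n = is_nehalem ? sizeof(fuse_nehalem) / sizeof(fuse_nehalem[0])
			      : sizeof(fuse_default) / sizeof(fuse_default[0]);

	if (!is_cond_branch(ins2))
		return false;

	for (size_t i = 0; i < n; i++) {
		if (mnemonic_matches(ins1, tbl[i]))
			return true;
	}
	return false;
}
```

With this, sketch_ins_is_fused(true, "add", "jne") is false on Nehalem, while
sketch_ins_is_fused(false, "addq", "jne") is true on the default path. Since
the operand rules are skipped, a true result only means fusion may happen.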

Thanks
Jin Yao