Re: [PATCH v1 1/5] perf/core: Define the common branch type classification

From: Jin, Yao
Date: Wed Apr 05 2017 - 20:09:20 EST




On 4/5/2017 12:09 AM, Arnaldo Carvalho de Melo wrote:
Em Tue, Apr 04, 2017 at 11:52:53PM +0800, Jin, Yao escreveu:

On 4/4/2017 10:18 PM, Arnaldo Carvalho de Melo wrote:
Adding the perf kernel maintainers to the CC list.

Em Fri, Mar 31, 2017 at 11:18:38PM +0800, Jin Yao escreveu:
It is often useful to know the branch types while analyzing branch
data. For example, a call is very different from a conditional branch.

Currently we have to look it up in binary while the binary may later
not be available and even the binary is available but user has to take
some time. It is very useful for user to check it directly in perf
report.

Perf already has support for disassembling the branch instruction
to get the branch type. The branch type is defined in lbr.c.

To keep consistent on kernel and userspace and make the classification
more common, the patch adds the common branch type classification
in perf_event.h.

Since the disassembling of branch instruction needs some overhead,
a new PERF_SAMPLE_BRANCH_TYPE_SAVE is introduced to indicate if it
needs to disassemble the branch instruction and record the branch
type.

Signed-off-by: Jin Yao <yao.jin@xxxxxxxxxxxxxxx>
---
include/uapi/linux/perf_event.h | 24 +++++++++++++++++++++++-
tools/include/uapi/linux/perf_event.h | 24 +++++++++++++++++++++++-
2 files changed, 46 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index d09a9cd..4d731fd 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -174,6 +174,8 @@ enum perf_branch_sample_type_shift {
PERF_SAMPLE_BRANCH_NO_FLAGS_SHIFT = 14, /* no flags */
PERF_SAMPLE_BRANCH_NO_CYCLES_SHIFT = 15, /* no cycles */
+ PERF_SAMPLE_BRANCH_TYPE_SAVE_SHIFT = 16, /* save branch type */
+
PERF_SAMPLE_BRANCH_MAX_SHIFT /* non-ABI */
};
@@ -198,9 +200,27 @@ enum perf_branch_sample_type {
PERF_SAMPLE_BRANCH_NO_FLAGS = 1U << PERF_SAMPLE_BRANCH_NO_FLAGS_SHIFT,
PERF_SAMPLE_BRANCH_NO_CYCLES = 1U << PERF_SAMPLE_BRANCH_NO_CYCLES_SHIFT,
+ PERF_SAMPLE_BRANCH_TYPE_SAVE =
+ 1U << PERF_SAMPLE_BRANCH_TYPE_SAVE_SHIFT,
+
PERF_SAMPLE_BRANCH_MAX = 1U << PERF_SAMPLE_BRANCH_MAX_SHIFT,
};
+/*
+ * Common flow change classification
+ */
+enum {
+ PERF_BR_NONE = 0, /* unknown */
+ PERF_BR_JCC_FWD = 1 << 1, /* conditional forward jump */
+ PERF_BR_JCC_BWD = 1 << 2, /* conditional backward jump */
+ PERF_BR_JMP = 1 << 3, /* jump */
+ PERF_BR_IND_JMP = 1 << 4, /* indirect jump */
+ PERF_BR_CALL = 1 << 5, /* call */
+ PERF_BR_IND_CALL = 1 << 6, /* indirect call */
+ PERF_BR_RET = 1 << 7, /* return */
+ PERF_BR_FAR_BRANCH = 1 << 8, /* SYSCALL,SYSRET,IRQ,... */
Humm, wouldn't be better to have those in separate buckets? I.e.

PERF_BR_SYSCALL
PERF_BR_SYSRET,
PERF_BR_IRQ

etc?
There are also other types. I.e. abort, ..... I use FAR_BRANCH is because I
just want to differentiate between basic branch types and others. (others
may be too much and platform specific).
I understand that this is what you need right now, but "syscall",
"sysret", "irq", look generic enough, no?

Really, really arch specific stuff could indeed be lumped together in a
FAR_BRANCH, but those used as an example, above (/*
SYSCALL,SYSRET,IRQ,... */) seems potentially useful to have untangled?

After considerations, yes, you are right. I will fix this in v2.

Thanks
Jin Yao

And why a bitmask? /me reads a bit more... couldn't find a reason for
this:

+ type:9, /* branch type */

Do you have a reason to use 9 bits? Why not just:

enum {
PERF_BR_NONE = 0, /* unknown */
PERF_BR_JCC_FWD = 1, /* conditional forward jump */
PERF_BR_JCC_BWD = 2, /* conditional backward jump */
PERF_BR_JMP = 3, /* jump */
PERF_BR_IND_JMP = 4, /* indirect jump */
PERF_BR_CALL = 5, /* call */
PERF_BR_IND_CALL = 6, /* indirect call */
PERF_BR_RET = 7, /* return */
PERF_BR_FAR_BRANCH = 8, /* SYSCALL,SYSRET,IRQ,... */

And then use, say, 4 or 5 bits for that type field?

I must be missing something trivial ;-\

- Arnaldo
You are right. I made things more complicated. Yes, the definitions should
be clear and simple. I will redefine them in v2.
Thanks, I wasn't missing anything, uff :-)
Thanks
Jin Yao

+};
+
#define PERF_SAMPLE_BRANCH_PLM_ALL \
(PERF_SAMPLE_BRANCH_USER|\
PERF_SAMPLE_BRANCH_KERNEL|\
@@ -999,6 +1019,7 @@ union perf_mem_data_src {
* in_tx: running in a hardware transaction
* abort: aborting a hardware transaction
* cycles: cycles from last branch (or 0 if not supported)
+ * type: branch type
*/
struct perf_branch_entry {
__u64 from;
@@ -1008,7 +1029,8 @@ struct perf_branch_entry {
in_tx:1, /* in transaction */
abort:1, /* transaction abort */
cycles:16, /* cycle count to last branch */
- reserved:44;
+ type:9, /* branch type */
+ reserved:35;
};
#endif /* _UAPI_LINUX_PERF_EVENT_H */
diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h
index d09a9cd..4d731fd 100644
--- a/tools/include/uapi/linux/perf_event.h
+++ b/tools/include/uapi/linux/perf_event.h
@@ -174,6 +174,8 @@ enum perf_branch_sample_type_shift {
PERF_SAMPLE_BRANCH_NO_FLAGS_SHIFT = 14, /* no flags */
PERF_SAMPLE_BRANCH_NO_CYCLES_SHIFT = 15, /* no cycles */
+ PERF_SAMPLE_BRANCH_TYPE_SAVE_SHIFT = 16, /* save branch type */
+
PERF_SAMPLE_BRANCH_MAX_SHIFT /* non-ABI */
};
@@ -198,9 +200,27 @@ enum perf_branch_sample_type {
PERF_SAMPLE_BRANCH_NO_FLAGS = 1U << PERF_SAMPLE_BRANCH_NO_FLAGS_SHIFT,
PERF_SAMPLE_BRANCH_NO_CYCLES = 1U << PERF_SAMPLE_BRANCH_NO_CYCLES_SHIFT,
+ PERF_SAMPLE_BRANCH_TYPE_SAVE =
+ 1U << PERF_SAMPLE_BRANCH_TYPE_SAVE_SHIFT,
+
PERF_SAMPLE_BRANCH_MAX = 1U << PERF_SAMPLE_BRANCH_MAX_SHIFT,
};
+/*
+ * Common flow change classification
+ */
+enum {
+ PERF_BR_NONE = 0, /* unknown */
+ PERF_BR_JCC_FWD = 1 << 1, /* conditional forward jump */
+ PERF_BR_JCC_BWD = 1 << 2, /* conditional backward jump */
+ PERF_BR_JMP = 1 << 3, /* jump */
+ PERF_BR_IND_JMP = 1 << 4, /* indirect jump */
+ PERF_BR_CALL = 1 << 5, /* call */
+ PERF_BR_IND_CALL = 1 << 6, /* indirect call */
+ PERF_BR_RET = 1 << 7, /* return */
+ PERF_BR_FAR_BRANCH = 1 << 8, /* SYSCALL,SYSRET,IRQ,... */
+};
+
#define PERF_SAMPLE_BRANCH_PLM_ALL \
(PERF_SAMPLE_BRANCH_USER|\
PERF_SAMPLE_BRANCH_KERNEL|\
@@ -999,6 +1019,7 @@ union perf_mem_data_src {
* in_tx: running in a hardware transaction
* abort: aborting a hardware transaction
* cycles: cycles from last branch (or 0 if not supported)
+ * type: branch type
*/
struct perf_branch_entry {
__u64 from;
@@ -1008,7 +1029,8 @@ struct perf_branch_entry {
in_tx:1, /* in transaction */
abort:1, /* transaction abort */
cycles:16, /* cycle count to last branch */
- reserved:44;
+ type:9, /* branch type */
+ reserved:35;
};
#endif /* _UAPI_LINUX_PERF_EVENT_H */
--
2.7.4