[PATCH 00/14] perf c2c: add a function view
From: Jiebin Sun
Date: Fri Jun 26 2026 - 03:03:23 EST
This series adds a new "function view" to perf c2c report, on top of the
existing cacheline view. It does not change the cacheline view; it adds a
second, complementary way to look at the same cache-to-cache (C2C) data.
Motivation
==========
The existing perf c2c report is organized by cacheline: it lists the most
contended cachelines and the symbols touching each one. That answers
"which cachelines are hot", but two common follow-up questions are harder
to answer directly:
  - Which functions are contending with each other?
  - For a given function, which other functions does it share cachelines
    with, and over which specific cachelines?
The function view reorganizes the same data around functions to answer
these directly, which is useful when chasing false sharing and cross-core
contention at the source-function level.
What it does
============
In the perf c2c TUI, press TAB in the cacheline view to switch to the
function view. It presents a 3-level hierarchy:
  Level 1: primary functions, sorted by Cycles % (estimated load cycles:
           HITM, peer-snoop and other-load cycles -- on systems whose
           default display mode is peer, such as Arm64, the peer-snoop
           component dominates)
  Level 2: other functions that share cachelines with the level-1 function
  Level 3: the specific shared cachelines for each function pair
Keys in the function view:
  TAB/ESC/q   return to the cacheline view
  d           show cacheline details for the selected entry
  e / +       expand / collapse the selected entry
  ?           help
The cacheline view and the --stdio output are unchanged.
Example
=======
A level-1 function is expanded (press 'e') to reveal the functions it
shares cachelines with, and one of those is expanded again to reveal the
specific shared cachelines:
  Shared Data Functions Table (27 entries, sorted on Cycles %)
     Cycles  Store
          %  count  Code address            Symbol                Cacheline
  ----------------------------------------------------------------------
  -  39.03%    541  - 0xffffffffa2fc5b08    - [k] cpupri_set
               450    - 0xffffffffa2fa28a5    - [k] pull_rt_task
               450                                                0xff2d0082809da080
Reading the three levels:
  - Level 1: cpupri_set is the top contended function, accounting for
    39.03% of the estimated load cycles. The table is sorted by this
    Cycles % column.
  - Level 2: expanding cpupri_set (press 'e') lists the functions it
    shares cachelines with, sorted by store count. Here pull_rt_task is
    the contending function, with 450 stores into the shared data.
  - Level 3: expanding pull_rt_task lists the specific cachelines the two
    functions contend over -- in this case the single cacheline at
    0xff2d0082809da080.
The view reads top-down as "cpupri_set is hottest; it shares data with
pull_rt_task; the contention is on cacheline ...da080" -- the
false-sharing chain that the cacheline view otherwise makes you
reconstruct by hand.
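As a rough illustration of the three levels and their sort orders, the
sketch below models the hierarchy with plain C structs. All type and
function names here (primary_fn, peer_fn, sort_function_view) are
hypothetical and chosen for this example only; the series itself builds
the view on perf's hists infrastructure rather than on arrays like these.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical types for illustration; not taken from the series. */
struct peer_fn {                    /* level 2: function sharing cachelines */
	const char     *symbol;
	unsigned int    stores;     /* store count into the shared data */
	const uint64_t *cachelines; /* level 3: shared cacheline addresses */
	size_t          nr_cachelines;
};

struct primary_fn {                 /* level 1: primary function */
	const char     *symbol;
	double          cycles_pct; /* share of estimated load cycles */
	struct peer_fn *peers;
	size_t          nr_peers;
};

/* Level-1 order: descending by Cycles %. */
static int primary_cmp(const void *a, const void *b)
{
	const struct primary_fn *l = a, *r = b;

	return (l->cycles_pct < r->cycles_pct) - (l->cycles_pct > r->cycles_pct);
}

/* Level-2 order: descending by store count. */
static int peer_cmp(const void *a, const void *b)
{
	const struct peer_fn *l = a, *r = b;

	return (l->stores < r->stores) - (l->stores > r->stores);
}

/* Sort level 1, then sort the level-2 list under each level-1 entry. */
static void sort_function_view(struct primary_fn *fns, size_t nr)
{
	if (!nr)
		return;

	qsort(fns, nr, sizeof(*fns), primary_cmp);
	for (size_t i = 0; i < nr; i++) {
		if (fns[i].nr_peers)
			qsort(fns[i].peers, fns[i].nr_peers,
			      sizeof(*fns[i].peers), peer_cmp);
	}
}
```

With the example data above, cpupri_set (39.03%) would sort first at
level 1 and pull_rt_task (450 stores) first among its level-2 entries.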
Implementation
==============
The function view is built as a separate hist_browser in
tools/perf/ui/browsers/c2c-function.c. Shared types and helpers used by
both views are factored out of builtin-c2c.c into a new c2c.h. The
hierarchy is constructed from the existing cacheline histograms into a
dedicated set of hists, keyed by (symbol, instruction address), and
rendered with custom column formatters.
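To make the (symbol, instruction address) keying concrete, here is a
minimal sketch of a comparison function over such a key. The fn_key
struct and fn_key_cmp() are assumptions made for this example and are
not the definitions used in c2c-function.c; in particular, comparing by
symbol name is a simplification.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical key type; not the struct used in the series. */
struct fn_key {
	const char *symbol; /* function name, e.g. "cpupri_set" */
	uint64_t    iaddr;  /* instruction (code) address */
};

/* Order keys by symbol name first, then by instruction address. */
static int fn_key_cmp(const struct fn_key *a, const struct fn_key *b)
{
	int ret;

	assert(a && b && a->symbol && b->symbol);

	ret = strcmp(a->symbol, b->symbol);
	if (ret)
		return ret;
	if (a->iaddr != b->iaddr)
		return a->iaddr < b->iaddr ? -1 : 1;
	return 0;
}
```

Two samples that hit the same symbol at the same code address compare
equal under this key and would therefore land in the same entry.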
The series is split into 14 small, self-contained patches so each step
can be reviewed and builds on its own.
Testing
=======
  - Each of the 14 commits builds on its own, and the series builds as a
    whole.
  - perf c2c report --stdio (cacheline view) output is unchanged versus
    the baseline: identical trace-event totals, shared-cacheline counts,
    and HITM tallies.
  - The function view was exercised on c2c recordings; the level-1
    ordering and the level-2/3 sharing breakdown match the underlying
    cacheline data.
Jiebin Sun (14):
  perf c2c: extract shared data structures into c2c.h
  perf c2c: add function view browser skeleton
  perf c2c: add function view type definitions and helpers
  perf c2c: add column format infrastructure for function view
  perf c2c: add column entry functions for function view
  perf c2c: add comparison functions for function view sorting
  perf c2c: add dimension definitions and format creation
  perf c2c: add HPP list parsing for function view histograms
  perf c2c: add stats merging and memory management helpers
  perf c2c: add hierarchy entry creation and lookup functions
  perf c2c: add function view hierarchy builder
  perf c2c: add function view browser UI
  perf c2c: add TAB key to switch to function view
  perf c2c: document function view in perf-c2c man page
 tools/perf/Documentation/perf-c2c.txt |   15 +
 tools/perf/builtin-c2c.c              |  128 +-
 tools/perf/c2c.h                      |  140 +++
 tools/perf/ui/browsers/Build          |    1 +
 tools/perf/ui/browsers/c2c-function.c | 1547 +++++++++++++++++++++++++
 5 files changed, 1713 insertions(+), 118 deletions(-)
 create mode 100644 tools/perf/c2c.h
 create mode 100644 tools/perf/ui/browsers/c2c-function.c
base-commit: 40db90ac9f66c8246c1746c56d397283d161655c
--
2.52.0