[PATCHSET 0/6] perf lock: Random updates for the locking analysis (v2)

From: Namhyung Kim
Date: Wed Jan 26 2022 - 19:00:59 EST


Hello,

I have some updates to the perf lock command (focused on 'report').
The main change adds a -c (or --combine-locks) option that aggregates
results by lock class name.

* Changes from v1:
- rebased onto recent acme/perf/core
- add Jiri's Acked-by

Without this option, the report is keyed by lock address, so instances
of the same lock class get separate entries, as shown below:

# perf lock report
                Name   acquired  contended   avg wait (ns) total wait (ns)   max wait (ns)   min wait (ns)

       rcu_read_lock     251225          0               0               0               0               0
 &(ei->i_block_re...       8731          0               0               0               0               0
 &sb->s_type->i_l...       8731          0               0               0               0               0
 hrtimer_bases.lo...       5261          0               0               0               0               0
 hrtimer_bases.lo...       2626          0               0               0               0               0
 hrtimer_bases.lo...       1953          0               0               0               0               0
 hrtimer_bases.lo...       1382          0               0               0               0               0
 cpu_hotplug_lock...       1350          0               0               0               0               0
 hrtimer_bases.lo...       1273          0               0               0               0               0
 hrtimer_bases.lo...       1269          0               0               0               0               0
 hrtimer_bases.lo...       1198          0               0               0               0               0
 hrtimer_bases.lo...       1116          0               0               0               0               0
         &base->lock       1109          0               0               0               0               0
 hrtimer_bases.lo...       1067          0               0               0               0               0
 hrtimer_bases.lo...       1052          0               0               0               0               0
 hrtimer_bases.lo...        957          0               0               0               0               0
 hrtimer_bases.lo...        948          0               0               0               0               0
        css_set_lock        791          0               0               0               0               0
 hrtimer_bases.lo...        752          0               0               0               0               0
 &lruvec->lru_loc...        747          5           11254           56272           18317            1412
 hrtimer_bases.lo...        738          0               0               0               0               0
 &newf->file_lock...        706         15            1025           15388            2279             618
 hrtimer_bases.lo...        702          0               0               0               0               0
 hrtimer_bases.lo...        694          0               0               0               0               0
 ...

With the -c option, the hrtimer_bases.lock instances are combined into a
single entry. Also note that the lock names are now displayed correctly.

# perf lock report -c
                Name   acquired  contended   avg wait (ns) total wait (ns)   max wait (ns)   min wait (ns)

       rcu_read_lock     251225          0               0               0               0               0
  hrtimer_bases.lock      39449          0               0               0               0               0
 &sb->s_type->i_l...      10301          1             662             662             662             662
    ptlock_ptr(page)      10173          2             701            1402             760             642
 &(ei->i_block_re...       8732          0               0               0               0               0
         &base->lock       6705          0               0               0               0               0
         &p->pi_lock       5549          0               0               0               0               0
 &dentry->d_lockr...       5010          4            1274            5097            1844             789
           &ep->lock       2750          0               0               0               0               0
 &(__futex_data.q...       2331          0               0               0               0               0
              (null)       1878          0               0               0               0               0
    cpu_hotplug_lock       1350          0               0               0               0               0
    &____s->seqcount       1349          0               0               0               0               0
    &newf->file_lock       1001         15            1025           15388            2279             618
 ...
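
For readers who want the idea without reading builtin-lock.c, here is a
rough Python sketch of what combining by class name amounts to. It is
not the code from this series: the LockStat type, the combine_by_name()
helper and the sample numbers are all made up for illustration. It
assumes counts and total wait are summed, max/min are taken across the
contended instances, and avg is total wait divided by contended (integer
division, which matches the rounding seen in the output above).

```python
from dataclasses import dataclass


@dataclass
class LockStat:
    """Hypothetical per-entry counters, one field per report column."""
    name: str
    acquired: int = 0
    contended: int = 0
    total_wait: int = 0  # ns
    max_wait: int = 0    # ns
    min_wait: int = 0    # ns, 0 means "never contended"

    @property
    def avg_wait(self) -> int:
        # Assumption: average is total wait / contended, truncated.
        return self.total_wait // self.contended if self.contended else 0


def combine_by_name(stats):
    """Merge per-address entries that share the same lock class name."""
    combined = {}
    for s in stats:
        c = combined.setdefault(s.name, LockStat(s.name))
        c.acquired += s.acquired
        c.contended += s.contended
        c.total_wait += s.total_wait
        c.max_wait = max(c.max_wait, s.max_wait)
        if s.contended:
            # Only contended instances contribute to the minimum.
            if c.min_wait == 0:
                c.min_wait = s.min_wait
            else:
                c.min_wait = min(c.min_wait, s.min_wait)
    # Sort by 'acquired', descending, like the listings above.
    return sorted(combined.values(), key=lambda s: s.acquired, reverse=True)


# Made-up per-address samples (not taken from the report above).
per_address = [
    LockStat("hrtimer_bases.lock", 5261),
    LockStat("hrtimer_bases.lock", 2626),
    LockStat("&newf->file_lock", 700, 10, 10000, 2000, 600),
    LockStat("&newf->file_lock", 300, 5, 5000, 2500, 700),
]

for s in combine_by_name(per_address):
    print(s.name, s.acquired, s.contended, s.avg_wait,
          s.total_wait, s.max_wait, s.min_wait)
```

The sketch keys purely on the name string; how the real code treats
instances whose name is missing (the "(null)" entry above) is not
covered here.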

Maybe we can make it the default later (with a config and --no-combine-locks).

You can get it from the 'perf/lock-combine-v2' branch at

git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git

Thanks,
Namhyung


Namhyung Kim (6):
  perf lock: Convert lockhash_table to use hlist
  perf lock: Change type of lock_stat->addr to u64
  perf lock: Sort map info based on class name
  perf lock: Fix lock name length check for printing
  perf lock: Add -c/--combine-locks option
  perf lock: Carefully combine lock stats for discarded entries

 tools/perf/Documentation/perf-lock.txt |   4 +
 tools/perf/builtin-lock.c              | 155 +++++++++++++++++++------
 2 files changed, 124 insertions(+), 35 deletions(-)


base-commit: e783362eb54cd99b2cac8b3a9aeac942e6f6ac07
--
2.35.0.rc0.227.g00780c9af4-goog