Re: [PATCHSET 0/7] perf lock contention: Improve performance if map is full (v1)

From: Ian Rogers
Date: Thu Apr 06 2023 - 20:35:22 EST


On Thu, Apr 6, 2023 at 2:06 PM Namhyung Kim <namhyung@xxxxxxxxxx> wrote:
>
> Hello,
>
> I got a report that the overhead of perf lock contention is too high in
> some cases. It was running in task aggregation mode (-t) at the time,
> and there were lots of tasks contending with each other.
>
> It turned out that the hash map update is the problem. The result is saved
> in the lock_stat hash map, which is pre-allocated. The BPF program never
> deletes data from the map; it only adds. But if the map is full, trying to
> update the map becomes a very heavy operation, since it needs to check
> every CPU's freelist to get a new node to save the result. We already know
> the update will fail when the map is full, so there is no need to try.
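
A minimal user-space sketch of the idea described above, not the BPF code
from the series: once an insert fails for lack of space, remember that in a
flag and skip the expensive insert path from then on, counting each dropped
update instead. The names stat_map, slow_insert(), record() and MAX_ENTRIES
are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_ENTRIES 4 /* hypothetical tiny capacity, for illustration only */

struct stat_map {
	unsigned long keys[MAX_ENTRIES];
	unsigned long counts[MAX_ENTRIES];
	size_t used;
	bool full;                   /* set once an insert fails for lack of space */
	unsigned long data_fail;     /* number of updates that were dropped */
	unsigned long slow_attempts; /* calls that reached the expensive path */
};

/* Stand-in for the expensive insert; returns -1 when no slot is free. */
static int slow_insert(struct stat_map *m, unsigned long key)
{
	assert(m->used <= MAX_ENTRIES);
	m->slow_attempts++;
	if (m->used == MAX_ENTRIES)
		return -1;
	m->keys[m->used] = key;
	m->counts[m->used] = 1;
	m->used++;
	return 0;
}

static void record(struct stat_map *m, unsigned long key)
{
	/* An existing key is updated in place; no new slot is needed. */
	for (size_t i = 0; i < m->used; i++) {
		if (m->keys[i] == key) {
			m->counts[i]++;
			return;
		}
	}

	/* The map is known to be full: count the failure, skip the insert. */
	if (m->full) {
		m->data_fail++;
		return;
	}

	if (slow_insert(m, key) < 0) {
		m->full = true;
		m->data_fail++;
	}
}
```

With a zero-initialized struct stat_map, calling record() with ten distinct
keys reaches slow_insert() only five times: four successful inserts plus the
one failure that sets the flag. The remaining five calls return early and
only bump data_fail.
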
>
> I've checked it on my 64 CPU machine with this.
>
> $ perf bench sched messaging -g 1000
> # Running 'sched/messaging' benchmark:
> # 20 sender and receiver processes per group
> # 1000 groups == 40000 processes run
>
> Total time: 2.825 [sec]
>
> And I used the task mode to guarantee that the map gets full.
> The default number of map entries is 16K and this workload has 40K tasks.
>
> Before:
> $ sudo ./perf lock con -abt -E3 -- perf bench sched messaging -g 1000
> # Running 'sched/messaging' benchmark:
> # 20 sender and receiver processes per group
> # 1000 groups == 40000 processes run
>
> Total time: 11.299 [sec]
>  contended   total wait     max wait     avg wait          pid   comm
>
>      19284       3.51 s      3.70 ms    181.91 us      1305863   sched-messaging
>        243     84.09 ms    466.67 us    346.04 us      1336608   sched-messaging
>        177     66.35 ms     12.08 ms    374.88 us      1220416   node
>
> After:
> $ sudo ./perf lock con -abt -E3 -- perf bench sched messaging -g 1000
> # Running 'sched/messaging' benchmark:
> # 20 sender and receiver processes per group
> # 1000 groups == 40000 processes run
>
> Total time: 3.044 [sec]
>  contended   total wait     max wait     avg wait          pid   comm
>
>      18743    591.92 ms    442.96 us     31.58 us      1431454   sched-messaging
>         51    210.64 ms    207.45 ms      4.13 ms      1468724   sched-messaging
>         81     68.61 ms     65.79 ms    847.07 us      1463183   sched-messaging
>
> === output for debug ===
>
> bad: 1164137, total: 2253341
> bad rate: 51.66 %
> histogram of failure reasons
>        task: 0
>       stack: 0
>        time: 0
>        data: 1164137
>
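
The figures quoted above can be cross-checked with a little arithmetic. The
sketch below uses two hypothetical helpers, bad_rate_percent() and
slowdown(), which are not part of perf, to recompute the bad rate from the
debug output and the run time relative to the unprofiled 2.825 s baseline.

```c
#include <assert.h>

/* Share of events that could not be recorded, in percent. */
static double bad_rate_percent(unsigned long bad, unsigned long total)
{
	assert(total > 0);
	return 100.0 * (double)bad / (double)total;
}

/* Run time under profiling relative to the unprofiled baseline. */
static double slowdown(double profiled_sec, double baseline_sec)
{
	assert(baseline_sec > 0.0);
	return profiled_sec / baseline_sec;
}
```

bad_rate_percent(1164137, 2253341) comes to about 51.66, matching the
reported bad rate. slowdown(11.299, 2.825) is roughly 4.0 for the "Before"
run, while slowdown(3.044, 2.825) is roughly 1.08 for the "After" run.
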
> The first few patches are small cleanups and fixes. You can get the code
> from the 'perf/lock-map-v1' branch in
>
> git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git
>
> Thanks,
> Namhyung
>
> Namhyung Kim (7):
>   perf lock contention: Simplify parse_lock_type()
>   perf lock contention: Use -M for --map-nr-entries
>   perf lock contention: Update default map size to 16384
>   perf lock contention: Add data failure stat
>   perf lock contention: Update total/bad stats for hidden entries
>   perf lock contention: Revise needs_callstack() condition
>   perf lock contention: Do not try to update if hash map is full

Series:
Acked-by: Ian Rogers <irogers@xxxxxxxxxx>

Thanks,
Ian

>  tools/perf/Documentation/perf-lock.txt        |  4 +-
>  tools/perf/builtin-lock.c                     | 64 ++++++++-----------
>  tools/perf/util/bpf_lock_contention.c         |  7 +-
>  .../perf/util/bpf_skel/lock_contention.bpf.c  | 29 +++++++--
>  tools/perf/util/bpf_skel/lock_data.h          |  3 +
>  tools/perf/util/lock-contention.h             |  2 +
>  6 files changed, 60 insertions(+), 49 deletions(-)
>
>
> base-commit: e5116f46d44b72ede59a6923829f68a8b8f84e76
> --
> 2.40.0.577.gac1e443424-goog
>