Re: [PATCH] perf lock contention: Clear lock addr after use

From: Arnaldo Carvalho de Melo
Date: Fri Oct 13 2023 - 19:00:51 EST


Em Thu, Sep 28, 2023 at 04:50:18PM -0700, Namhyung Kim escreveu:
> It checks the current lock to calculated the delta of contention time.

> The address is saved in the tstamp map which is allocated at begining of
> contention and released at end of contention.
>
> But it's possible for bpf_map_delete_elem() to fail. In that case, the

How can it fail?

You do:

pelem = bpf_map_lookup_elem(&tstamp, &pid);
if (!pelem || pelem->lock != ctx[0])
return 0;

So it is there, why would the removal using the same key fail?

The patch should work as-is, I'm just curious about what would make
there removal of a map entry that was successfully looked up on the same
contention_end prog to fail when being removed...

- Arnaldo

> element in the tstamp map kept for the current lock and it makes the
> next contention for the same lock tracked incorrectly. Specificially
> the next contention begin will see the existing element for the task and
> it'd just return. Then the next contention end will see the element and
> calculate the time using the timestamp for the previous begin.
>
> This can result in a large value for two small contentions happened from
> time to time. Let's clear the lock address so that it can be updated
> next time even if the bpf_map_delete_elem() failed.
>
> Signed-off-by: Namhyung Kim <namhyung@xxxxxxxxxx>
> ---
> tools/perf/util/bpf_skel/lock_contention.bpf.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/tools/perf/util/bpf_skel/lock_contention.bpf.c b/tools/perf/util/bpf_skel/lock_contention.bpf.c
> index 4900a5dfb4a4..b11179452e19 100644
> --- a/tools/perf/util/bpf_skel/lock_contention.bpf.c
> +++ b/tools/perf/util/bpf_skel/lock_contention.bpf.c
> @@ -389,6 +389,7 @@ int contention_end(u64 *ctx)
>
> duration = bpf_ktime_get_ns() - pelem->timestamp;
> if ((__s64)duration < 0) {
> + pelem->lock = 0;
> bpf_map_delete_elem(&tstamp, &pid);
> __sync_fetch_and_add(&time_fail, 1);
> return 0;
> @@ -422,6 +423,7 @@ int contention_end(u64 *ctx)
> data = bpf_map_lookup_elem(&lock_stat, &key);
> if (!data) {
> if (data_map_full) {
> + pelem->lock = 0;
> bpf_map_delete_elem(&tstamp, &pid);
> __sync_fetch_and_add(&data_fail, 1);
> return 0;
> @@ -445,6 +447,7 @@ int contention_end(u64 *ctx)
> data_map_full = 1;
> __sync_fetch_and_add(&data_fail, 1);
> }
> + pelem->lock = 0;
> bpf_map_delete_elem(&tstamp, &pid);
> return 0;
> }
> @@ -458,6 +461,7 @@ int contention_end(u64 *ctx)
> if (data->min_time > duration)
> data->min_time = duration;
>
> + pelem->lock = 0;
> bpf_map_delete_elem(&tstamp, &pid);
> return 0;
> }
> --
> 2.42.0.582.g8ccd20d70d-goog
>

--

- Arnaldo