Re: [PATCH v1 1/5] libperf cpumap: Correct reference count for perf_cpu_map__merge()

From: Adrian Hunter
Date: Thu Oct 10 2024 - 13:42:15 EST


On 25/09/24 22:53, Leo Yan wrote:
> The perf_cpu_map__merge() function has two arguments, 'orig' and 'other',
> and it returns results for three different cases:
>
> Case 1: 'other' is a subset of 'orig'.
> Case 2: 'orig' is a subset of 'other'.
> Case 3: 'orig' and 'other' are not subsets of each other.
>
> The result combinations are:
>
> +--------+-------------+-----------+-----------+
> | Cases | Result | orig | other |
> +--------+-------------+-----------+-----------+
> | Case1 | orig | No change | No change |
> +--------+-------------+-----------+-----------+
> | Case2 | other | No change | refcnt++ |
> +--------+-------------+-----------+-----------+
> | Case3 | New CPU map | refcnt-- | No change |
> +--------+-------------+-----------+-----------+
>
> Both Case 1 and Case 3 have a risk of releasing maps unexpectedly. This
> is because the reference counter operations are not consistent crossing
> different cases and leads to difficulty for callers handling them.
>
> For Case 1, because 'other' is a subset of 'orig', 'orig' is returned as
> the merging result, but its refcnt is not incremented. This can lead to
> the map being released repeatedly:
>
> struct perf_cpu_map *a = perf_cpu_map__new("1,2");
> struct perf_cpu_map *b = perf_cpu_map__new("2");
>
> /* 'm' and 'a' point to the same CPU map */
> struct perf_cpu_map *m = perf_cpu_map__merge(a, b);
>
> ...
>
> perf_cpu_map__put(m); -> Release the map
> perf_cpu_map__put(b);
> perf_cpu_map__put(a); -> Release the same merged again
>
> For Case 3, it is possible that the CPU map pointed to by 'orig' can be
> released twice: within the function and outside of it.
>
> struct perf_cpu_map *a = perf_cpu_map__new("1,2");
> struct perf_cpu_map *b = perf_cpu_map__new("3");
>
> struct perf_cpu_map *m = perf_cpu_map__merge(a, b);
> `> 'm' is new allocated map. But 'a' has
> been released in the function.
> ...
>
> perf_cpu_map__put(m);
> perf_cpu_map__put(b);
> perf_cpu_map__put(a); -> Release the 'a' map again
>
> This commit increases the reference counter if a passed map is returned.
> For the case of a newly allocated map, it does not change the reference
> counter for passed maps.

The 2 non-test uses of perf_cpu_map__merge both do:

a = perf_cpu_map__merge(a, b);

so another way to make the API less misleading would be
to introduce:

err = perf_cpu_map__merge_in(&a, b);

where:

int perf_cpu_map__merge_in(struct perf_cpu_map **orig, struct perf_cpu_map *other)
{
struct perf_cpu_map *result = perf_cpu_map__merge(*orig, other);

if (!result)
return -ENOMEM;

*orig = result;
return 0;
}

without any changes to perf_cpu_map__merge().

>
> Signed-off-by: Leo Yan <leo.yan@xxxxxxx>
> ---
> tools/lib/perf/cpumap.c | 11 +++--------
> 1 file changed, 3 insertions(+), 8 deletions(-)
>
> diff --git a/tools/lib/perf/cpumap.c b/tools/lib/perf/cpumap.c
> index cae799ad44e1..3f80eade8b1c 100644
> --- a/tools/lib/perf/cpumap.c
> +++ b/tools/lib/perf/cpumap.c
> @@ -438,9 +438,7 @@ bool perf_cpu_map__is_subset(const struct perf_cpu_map *a, const struct perf_cpu
> /*
> * Merge two cpumaps
> *
> - * orig either gets freed and replaced with a new map, or reused
> - * with no reference count change (similar to "realloc")
> - * other has its reference count increased.
> + * Return a new map, or reused with its reference count increased.
> */
>
> struct perf_cpu_map *perf_cpu_map__merge(struct perf_cpu_map *orig,
> @@ -452,11 +450,9 @@ struct perf_cpu_map *perf_cpu_map__merge(struct perf_cpu_map *orig,
> struct perf_cpu_map *merged;
>
> if (perf_cpu_map__is_subset(orig, other))
> - return orig;
> - if (perf_cpu_map__is_subset(other, orig)) {
> - perf_cpu_map__put(orig);
> + return perf_cpu_map__get(orig);
> + if (perf_cpu_map__is_subset(other, orig))
> return perf_cpu_map__get(other);
> - }
>
> tmp_len = __perf_cpu_map__nr(orig) + __perf_cpu_map__nr(other);
> tmp_cpus = malloc(tmp_len * sizeof(struct perf_cpu));
> @@ -483,7 +479,6 @@ struct perf_cpu_map *perf_cpu_map__merge(struct perf_cpu_map *orig,
>
> merged = cpu_map__trim_new(k, tmp_cpus);
> free(tmp_cpus);
> - perf_cpu_map__put(orig);
> return merged;
> }
>