On 25/09/24 22:53, Leo Yan wrote:
The perf_cpu_map__merge() function has two arguments, 'orig' and 'other',
and it returns results for three different cases:
Case 1: 'other' is a subset of 'orig'.
Case 2: 'orig' is a subset of 'other'.
Case 3: 'orig' and 'other' are not subsets of each other.
The result combinations are:
+--------+-------------+-----------+-----------+
| Cases | Result | orig | other |
+--------+-------------+-----------+-----------+
| Case1 | orig | No change | No change |
+--------+-------------+-----------+-----------+
| Case2 | other | No change | refcnt++ |
+--------+-------------+-----------+-----------+
| Case3 | New CPU map | refcnt-- | No change |
+--------+-------------+-----------+-----------+
Both Case 1 and Case 3 have a risk of releasing maps unexpectedly. This
is because the reference counter operations are not consistent crossing
different cases and leads to difficulty for callers handling them.
For Case 1, because 'other' is a subset of 'orig', 'orig' is returned as
the merging result, but its refcnt is not incremented. This can lead to
the map being released repeatedly:
struct perf_cpu_map *a = perf_cpu_map__new("1,2");
struct perf_cpu_map *b = perf_cpu_map__new("2");
/* 'm' and 'a' point to the same CPU map */
struct perf_cpu_map *m = perf_cpu_map__merge(a, b);
...
perf_cpu_map__put(m); -> Release the map
perf_cpu_map__put(b);
perf_cpu_map__put(a); -> Release the same merged again
For Case 3, it is possible that the CPU map pointed to by 'orig' can be
released twice: within the function and outside of it.
struct perf_cpu_map *a = perf_cpu_map__new("1,2");
struct perf_cpu_map *b = perf_cpu_map__new("3");
struct perf_cpu_map *m = perf_cpu_map__merge(a, b);
`> 'm' is new allocated map. But 'a' has
been released in the function.
...
perf_cpu_map__put(m);
perf_cpu_map__put(b);
perf_cpu_map__put(a); -> Release the 'a' map again
This commit increases the reference counter if a passed map is returned.
For the case of a newly allocated map, it does not change the reference
counter for passed maps.
The 2 non-test uses of perf_cpu_map__merge both do:
a = perf_cpu_map__merge(a, b);
so another way to make the API less misleading would be
to introduce:
err = perf_cpu_map__merge_in(&a, b);
where:
int perf_cpu_map__merge_in(struct perf_cpu_map **orig, struct perf_cpu_map *other)
{
struct perf_cpu_map *result = perf_cpu_map__merge(*orig, other);
if (!result)
return -ENOMEM;
*orig = result;
return 0;
}
without any changes to perf_cpu_map__merge().