Re: [PATCH v1 1/5] libperf cpumap: Correct reference count for perf_cpu_map__merge()

From: Leo Yan
Date: Fri Oct 11 2024 - 05:35:00 EST

Next message: Amelie Delaunay: "Re: [PATCH 06/11] dt-bindings: dma: stm32-dma3: introduce st,axi-max-burst-len property"
Previous message: Christian Loehle: "Re: [PATCH v7] sched: Consolidate cpufreq updates"
In reply to: Adrian Hunter: "Re: [PATCH v1 1/5] libperf cpumap: Correct reference count for perf_cpu_map__merge()"
Next in thread: Leo Yan: "Re: [PATCH v1 1/5] libperf cpumap: Correct reference count for perf_cpu_map__merge()"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

Hi Adrian,

On 10/10/24 18:41, Adrian Hunter wrote:>

On 25/09/24 22:53, Leo Yan wrote:

The perf_cpu_map__merge() function has two arguments, 'orig' and 'other',
and it returns results for three different cases:

Case 1: 'other' is a subset of 'orig'.
Case 2: 'orig' is a subset of 'other'.
Case 3: 'orig' and 'other' are not subsets of each other.

The result combinations are:

+--------+-------------+-----------+-----------+
| Cases | Result | orig | other |
+--------+-------------+-----------+-----------+
| Case1 | orig | No change | No change |
+--------+-------------+-----------+-----------+
| Case2 | other | No change | refcnt++ |
+--------+-------------+-----------+-----------+
| Case3 | New CPU map | refcnt-- | No change |
+--------+-------------+-----------+-----------+

Both Case 1 and Case 3 have a risk of releasing maps unexpectedly. This
is because the reference counter operations are not consistent crossing
different cases and leads to difficulty for callers handling them.

For Case 1, because 'other' is a subset of 'orig', 'orig' is returned as
the merging result, but its refcnt is not incremented. This can lead to
the map being released repeatedly:

struct perf_cpu_map *a = perf_cpu_map__new("1,2");
struct perf_cpu_map *b = perf_cpu_map__new("2");

/* 'm' and 'a' point to the same CPU map */
struct perf_cpu_map *m = perf_cpu_map__merge(a, b);

...

perf_cpu_map__put(m); -> Release the map
perf_cpu_map__put(b);
perf_cpu_map__put(a); -> Release the same merged again

For Case 3, it is possible that the CPU map pointed to by 'orig' can be
released twice: within the function and outside of it.

struct perf_cpu_map *a = perf_cpu_map__new("1,2");
struct perf_cpu_map *b = perf_cpu_map__new("3");

struct perf_cpu_map *m = perf_cpu_map__merge(a, b);
`> 'm' is new allocated map. But 'a' has
been released in the function.
...

perf_cpu_map__put(m);
perf_cpu_map__put(b);
perf_cpu_map__put(a); -> Release the 'a' map again

This commit increases the reference counter if a passed map is returned.
For the case of a newly allocated map, it does not change the reference
counter for passed maps.

The 2 non-test uses of perf_cpu_map__merge both do:

a = perf_cpu_map__merge(a, b);

so another way to make the API less misleading would be
to introduce:

err = perf_cpu_map__merge_in(&a, b);

where:

int perf_cpu_map__merge_in(struct perf_cpu_map **orig, struct perf_cpu_map *other)
{
struct perf_cpu_map *result = perf_cpu_map__merge(*orig, other);

if (!result)
return -ENOMEM;

*orig = result;
return 0;
}

without any changes to perf_cpu_map__merge().

Just wandering why we cannot do the same thing for the perf_cpu_map__merge() function?

int perf_cpu_map__merge_in(struct perf_cpu_map **orig,
struct perf_cpu_map *other)

This can allow us to avoid any confusion in the first place. And we don't need to maintain two functions for the same thing.

Thanks,
Leo

Next message: Amelie Delaunay: "Re: [PATCH 06/11] dt-bindings: dma: stm32-dma3: introduce st,axi-max-burst-len property"
Previous message: Christian Loehle: "Re: [PATCH v7] sched: Consolidate cpufreq updates"
In reply to: Adrian Hunter: "Re: [PATCH v1 1/5] libperf cpumap: Correct reference count for perf_cpu_map__merge()"
Next in thread: Leo Yan: "Re: [PATCH v1 1/5] libperf cpumap: Correct reference count for perf_cpu_map__merge()"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]