Re: [PATCH v3 1/5] perf cpumap: If the die_id is missing use socket/physical_package_id

From: Ian Rogers
Date: Fri Dec 20 2024 - 13:07:55 EST


On Fri, Dec 20, 2024 at 2:28 AM James Clark <james.clark@xxxxxxxxxx> wrote:
>
> On 19/12/2024 9:53 pm, Namhyung Kim wrote:
> > On Thu, Dec 19, 2024 at 09:20:39AM -0800, Ian Rogers wrote:
> >> On Wed, Dec 18, 2024 at 9:32 PM Namhyung Kim <namhyung@xxxxxxxxxx> wrote:
> >>>
> >>> On Wed, Dec 18, 2024 at 09:42:27AM -0800, Ian Rogers wrote:
> >>>> On Wed, Dec 18, 2024 at 4:04 AM James Clark <james.clark@xxxxxxxxxx> wrote:
> >>>>>
> >>>>>
> >>>>>
> >>>>> On 16/12/2024 11:24 pm, Ian Rogers wrote:
> >>>>>> An error value for a missing die_id may be written into things like
> >>>>>> the cpu_topology_map. As the topology needs to be fully written out,
> >>>>>> including the die_id, to allow perf.data file features to be aligned
> >>>>>> we can't allow error values to be written out. Instead base the
> >>>>>> missing die_id value off of the socket/physical_package_id assuming
> >>>>>> they correlate 1:1.
> >>>>>>
> >>>>>> Signed-off-by: Ian Rogers <irogers@xxxxxxxxxx>
> >>>>>> ---
> >>>>>> tools/perf/util/cpumap.c | 3 ++-
> >>>>>> 1 file changed, 2 insertions(+), 1 deletion(-)
> >>>>>>
> >>>>>> diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c
> >>>>>> index 27094211edd8..d362272f8466 100644
> >>>>>> --- a/tools/perf/util/cpumap.c
> >>>>>> +++ b/tools/perf/util/cpumap.c
> >>>>>> @@ -283,7 +283,8 @@ int cpu__get_die_id(struct perf_cpu cpu)
> >>>>>> {
> >>>>>> int value, ret = cpu__get_topology_int(cpu.cpu, "die_id", &value);
> >>>>>>
> >>>>>> - return ret ?: value;
> >>>>>> + /* If die_id is missing fallback on using the socket/physical_package_id. */
> >>>>>> + return ret || value < 0 ? cpu__get_socket_id(cpu) : value;
> >>>>>> }
> >>>>>>
> >>>>>> struct aggr_cpu_id aggr_cpu_id__die(struct perf_cpu cpu, void *data)
> >>>>>
> >>>>> Hi Ian,
> >>>>>
> >>>>> I sent a fix for the same or a similar problem here [1]. For this one
> >>>>> I'm not sure why we'd want to use the socket ID for die when it's always
> >>>>> been 0 for not present. I wonder if this change is mingling two things:
> >>>>> fixing the negative error value appearing and replacing die with socket ID.
> >>>>>
> >>>>> Personally I would prefer to keep the 0 to fix the error value, that way
> >>>>> nobody gets surprised by the change.
> >>>>>
> >>>>> Also it looks like cpu__get_cluster_id() can suffer from the same issue,
> >>>>> and if we do it this way we should drop these as they aren't valid anymore:
> >>>>>
> >>>>> /* There is no die_id on legacy system. */
> >>>>> if (die < 0)
> >>>>> die = 0;
> >>>>
> >>>> I think this breaks the assumption here:
> >>>> https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/tests/expr.c?h=perf-tools-next#n244
> >>>>>> Hmm.. I'm not sure how it worked before. The code is already
> there and
> >>> it just changed the condition from == -1 to < 0, right?
> >>
> >> You'd need to be testing on a multi-socket machine to see the issue.
> >> If you had say a dual socket Ampere chip and the die_id was missing,
> >> does it make sense for there to be two sockets/packages but only 1
> >> die? I think it is best we assume 1 die per socket when the die_id is
> >> missing, and to some crippled extent (because of the s390 workaround)
> >> the expr test is doing the sanity check.
> >
> > AFAICS die ID is always used with socket ID already so I guess it means
> > an ID inside a socket.
> >
> > Thanks,
> > Namhyung
> >
>
> Don't we then need to update has_die_topology() to always return true if
> we're using socket ID as a placeholder? And the #num_dies expression
> should potentially become non-zero for consistency?
>
> Seems like there are two options:
>
> - Fully support "no die reported"
> * -errno is saved into the topology header, but it's still converted
> to 0 for Perf stat aggregation and display
> * #num_dies == 0 as it currently is
> * Anywhere else die is used it's checked for error values first
>
> - Replace die with package
> * #num_dies == #num_packages
> * has_die_topology() == true
> * delete code to display die as 0 in perf stat
>
> With this patch we get somewhere between the two.

Thanks James, I'll try to do a larger cleanup in this area. My purpose
for sending the patches was the unaligned feature fields such as CPU
topology cause data alignment issues in the perf.data file. When these
occur we have builds that turn undefined behavior (such as bad
alignment) into illegal instructions and we see widespread test
regressions. The extended pipe tests in v6.12 did this. So the rough
flow of what I've been going through is:
1) CPU topology has perf.data file alignment issues.
2) aligning the CPU topology breaks "old" perf.data files as the size
of the feature is used to detect whether die_id, etc. are present.
3) I've discovered a latent issue that ARM and S390 aren't exposing
die_id and falling into the "old" perf.data file case.
4) let's write out everything in the CPU topology even on ARM and S390
so that the old file test can be just that, but new files have all the
topology.
5) the CPU topology is a mish-mash of skipped tests and things working
not particularly by design.
6) we need to come up with values for die_id, etc. to fix the topology
but these things aren't well defined. I'm reminded of past discussions
of logical ids (unique per core/die/..) vs the normal ids perf users
which may be duplicated across cores.
Anyway, I'll see what I can do make this have a better API, etc.
inspired from your and Namhyung's feedback.

Thanks,
Ian