Re: [PATCH] perf test: Fix perf test 11 hwmon endianess issue

From: Ian Rogers
Date: Sun Feb 02 2025 - 01:48:37 EST


On Sat, Feb 1, 2025, 9:14 AM David Laight <david.laight.linux@xxxxxxxxx> wrote:
>
> On Sat, 1 Feb 2025 08:12:38 -0800
> Ian Rogers <irogers@xxxxxxxxxx> wrote:
>
> > On Sat, Feb 1, 2025 at 3:34 AM David Laight
> > <david.laight.linux@xxxxxxxxx> wrote:
> > >
> > > On Fri, 31 Jan 2025 12:24:00 +0100
> > > Thomas Richter <tmricht@xxxxxxxxxxxxx> wrote:
> > >
> > > > perf test 11 hwmon fails on s390 with this error
> > > >
> > > > # ./perf test -Fv 11
> > > > --- start ---
> > > > ---- end ----
> > > > 11.1: Basic parsing test : Ok
> > > > --- start ---
> > > > Testing 'temp_test_hwmon_event1'
> > > > Using CPUID IBM,3931,704,A01,3.7,002f
> > > > temp_test_hwmon_event1 -> hwmon_a_test_hwmon_pmu/temp_test_hwmon_event1/
> > > > FAILED tests/hwmon_pmu.c:189 Unexpected config for
> > > > 'temp_test_hwmon_event1', 292470092988416 != 655361
> > > > ---- end ----
> > > > 11.2: Parsing without PMU name : FAILED!
> > > > --- start ---
> > > > Testing 'hwmon_a_test_hwmon_pmu/temp_test_hwmon_event1/'
> > > > FAILED tests/hwmon_pmu.c:189 Unexpected config for
> > > > 'hwmon_a_test_hwmon_pmu/temp_test_hwmon_event1/',
> > > > 292470092988416 != 655361
> > > > ---- end ----
> > > > 11.3: Parsing with PMU name : FAILED!
> > > > #
> > > >
> > > > The root cause is in member test_event::config which is initialized
> > > > to 0xA0001 or 655361. During event parsing a long list event parsing
> > > > functions are called and end up with this gdb call stack:
> > > ...
> > > > However member key::type_and_num is defined as union and bit field:
> > > >
> > > > union hwmon_pmu_event_key {
> > > > long type_and_num;
> > > > struct {
> > > > int num :16;
> > > > enum hwmon_type type :8;
> > > > };
> > > > };
> > >
> > > That is entirely horrid.
> >
> > It also has the side-effect that if you initialize the struct bits in
> > the type_and_num will be undefined and sanitizers will try to trip you
> > up.
> >
> > > I'm surprised this even compiles:
> > > static size_t hwmon_pmu__event_hashmap_hash(long key, void *ctx __maybe_unused)
> > > {
> > > return ((union hwmon_pmu_event_key)key).type_and_num;
> > > }
> > > It has to be just 'return key', but I'm not sure what the hashmap code is doing.
> > >
> > > AFAICT the code is just trying to generate a value for the hashmap to hash on?
> > > Why not just use (type << 16 | num) instead of 'trying to be clever' with a union?
> >
> > The purpose wasn't so much to be 'clever' as you say. Perf event
> > configs, which is where type_and_num lives in the events, have their
> > bitfields described either by convention or by files in
> > /sys/devices/<pmu>/format. On an Intel laptop:
> > ```
> > $ cat /sys/devices/cpu/format/inv
> > config:23
> > ```
> > So I can go:
> > ```
> > $ perf stat -e 'inst_retired.any/inv=1/' true
> >
> > Performance counter stats for 'true':
> >
> > 1,069,756 inst_retired.any/inv=1/
> >
> > 0.001539265 seconds time elapsed
> >
> > 0.001719000 seconds user
> > 0.000000000 seconds sys
> > ```
> > and the configuration of my inst_retired.any event now has bit 23 in
> > the config set. Just as the format files try to break down what the
> > bitfields within a config u64 are doing, the union was trying to do
> > similar. Making an attempt to have the value be intention revealing
> > which the shift and masking may lose, for example, by making it look
> > like bits in the num could overlap with and modify the type.
>
> Except that is really doesn't work for big-endian at all.
> The 'fix' doesn't even really paper over the cracks properly.
>
> I'm not even sure the hashing works on 64bit BE.
> (assuming the int:n are also BE - but that is a different ABI option.)
> The 'num' and 'type' will be in the high 24 bits of the 64bit 'long'.
> If we assume the rest of the union is actually initialised the other
> bits are all zero.
> And I guess that the hash function is masking with 2^n-1 - so always gets zero.

No, as is typical in any hash table you always do a secondary hash:
https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/hashmap.h#n15

Thanks,
Ian

> David
>
> >
> > Thanks,
> > Ian
>