Re: [RFC] perf script AMD/IBS: Add scripts to show function/instruction level granular profile

From: Ravi Bangoria
Date: Mon Jan 27 2025 - 04:01:23 EST

Next message: Yu Kuai: "[PATCH 6.6 6/6] md/md-bitmap: move bitmap_{start, end}write to md upper layer"
Previous message: Yu Kuai: "[PATCH 6.6 2/6] md/md-bitmap: factor behind write counters out from bitmap_{start/end}write()"
In reply to: Ian Rogers: "Re: [RFC] perf script AMD/IBS: Add scripts to show function/instruction level granular profile"
Next in thread: Namhyung Kim: "Re: [RFC] perf script AMD/IBS: Add scripts to show function/instruction level granular profile"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

Hi Ian,

>> diff --git a/tools/perf/scripts/python/amd-ibs-fetch-metrics.py b/tools/perf/scripts/python/amd-ibs-fetch-metrics.py
>> new file mode 100644
>> index 000000000000..63a91843585f
>> --- /dev/null
>> +++ b/tools/perf/scripts/python/amd-ibs-fetch-metrics.py
>> @@ -0,0 +1,219 @@
>> +# SPDX-License-Identifier: GPL-2.0
>> +#
>> +# Copyright (C) 2025 Advanced Micro Devices, Inc.
>> +#
>> +# Print various metric events at function granularity using AMD IBS Fetch PMU.
>> +
>> +from __future__ import print_function
>
> I think at some future point we should go through the perf python code
> and strip out python2-isms like this. There's no need to add more as
> python2 doesn't exist any more.

Ack.

>> +allowed_sort_keys = ("nr_samples", "oc_miss", "ic_miss", "l2_miss", "l3_miss", "abort", "l1_itlb_miss", "l2_itlb_miss")
>> +default_sort_order = ("nr_samples",) # Trailing comman is needed for single member tuple
>
> Given these are lists of strings, I'm not sure why you're trying to use tuples?

I'm not a python expert, but AFAIU, tuple is the data-structure for
immutable list. No?

>> +data = {};
>> +
>> +def init_data_element(symbol, cpumode, dso):
>
> Consider types and using mypy? Fwiw, I sent this (reviewed but not merged):
> https://lore.kernel.org/lkml/20241025172303.77538-1-irogers@xxxxxxxxxx/
> which adds build support for mypy and pylint, although not enabled by
> default given the number of errors.

Sure. I'll explore this.

>> +def get_cpumode(cpumode):
>> + if (cpumode == 1):
>> + return 'K'
>> + if (cpumode == 2):
>> + return 'U'
>> + if (cpumode == 3):
>> + return 'H'
>> + if (cpumode == 4):
>> + return 'GK'
>> + if (cpumode == 5):
>> + return 'GU'
>> + return '?'
>
> Perhaps use a dictionary? Something like:
> ```
> def get_cpumode(cpumode: int)- > str:
> modes = {
> 1: 'K',
> 2: 'U',
> 3: 'H',
> 4: 'GK',
> 5: 'GU',
> }
> return modes[cpumode] if cpumode in modes else '?'
> ```

+1

>> + print("%-45s| %7d | %7d (%6.2f%%) %7d (%6.2f%%) %7d (%6.2f%%) %7d (%6.2f%%)"
>> + " %7d %7d | %7d (%6.2f%%) | %7d (%6.2f%%) %7d (%6.2f%%) | %s" %
>> + (symbol_cpumode, d[1]['nr_samples'], d[1]['oc_miss'], oc_miss_perc,
>> + d[1]['ic_miss'], ic_miss_perc, d[1]['l2_miss'], l2_miss_perc,
>> + d[1]['l3_miss'], l3_miss_perc, pct_lat, avg_lat, d[1]['abort'],
>> + abort_perc, d[1]['l1_itlb_miss'], l1_itlb_miss_perc,
>> + d[1]['l2_itlb_miss'], l2_itlb_miss_perc, d[1]['dso']))
>
> Fwiw, I'm letting gemini convert these to f-strings. If I trust AI this becomes:
> ```
> print(f"{symbol_cpumode:<45s}| {d[1]['nr_samples']:7d} |
> {d[1]['oc_miss']:7d} ({oc_miss_perc:6.2f}%) {d[1]['ic_miss']:7d}
> ({ic_miss_perc:6.2f}%) {d[1]['l2_miss']:7d} ({l2_miss_perc:6.2f}%)
> {d[1]['l3_miss']:7d} ({l3_miss_perc:6.2f}%) {pct_lat:7d} {avg_lat:7d}
> | {d[1]['abort']:7d} ({abort_perc:6.2f}%) | {d[1]['l1_itlb_miss']:7d}
> ({l1_itlb_miss_perc:6.2f}%) {d[1]['l2_itlb_miss']:7d}
> ({l2_itlb_miss_perc:6.2f}%) | {d[1]['dso']:s}")
> ```
> But given that keeping all these prints in sync is error prone, I
> think a helper function is the way to go.

Sure. will convert it into a helper function.

>> +annotate_symbol = None
>> +annodate_dso = None
>
> annotate_dso?

Ack.

>> +def disassemble_symbol(symbol, dso):
>> + global data
>> +
>> + readelf = subprocess.Popen(["readelf", "-WsC", "--sym-base=16", dso],
>> + stdout=subprocess.PIPE, text=True)
>> + grep = subprocess.Popen(["grep", "-w", symbol], stdin=readelf.stdout,
>> + stdout=subprocess.PIPE, text=True)
>> + output, error = grep.communicate()
>
> Perhaps the pyelftools would be better here?
> https://eli.thegreenplace.net/2012/01/06/pyelftools-python-library-for-parsing-elf-and-dwarf

Right, using library instead of hardcoded shell command would be
better.

Thanks for the feedback,
Ravi

Next message: Yu Kuai: "[PATCH 6.6 6/6] md/md-bitmap: move bitmap_{start, end}write to md upper layer"
Previous message: Yu Kuai: "[PATCH 6.6 2/6] md/md-bitmap: factor behind write counters out from bitmap_{start/end}write()"
In reply to: Ian Rogers: "Re: [RFC] perf script AMD/IBS: Add scripts to show function/instruction level granular profile"
Next in thread: Namhyung Kim: "Re: [RFC] perf script AMD/IBS: Add scripts to show function/instruction level granular profile"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]