Re: [PATCH v1 01/20] perf jevents: Add RAPL metrics for all Intel models

From: Ian Rogers
Date: Thu Feb 29 2024 - 20:02:32 EST


On Thu, Feb 29, 2024 at 12:59 PM Liang, Kan <kan.liang@xxxxxxxxxxxxxxx> wrote:
>
>
>
> On 2024-02-28 7:17 p.m., Ian Rogers wrote:
> > Add a 'cpu_power' metric group that computes the power consumption
> > from RAPL events if they are present.
> >
> > Signed-off-by: Ian Rogers <irogers@xxxxxxxxxx>
> > ---
> > tools/perf/pmu-events/intel_metrics.py | 45 ++++++++++++++++++++++++--
> > 1 file changed, 42 insertions(+), 3 deletions(-)
> >
> > diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
> > index 4fbb31c9eccd..5827f555005f 100755
> > --- a/tools/perf/pmu-events/intel_metrics.py
> > +++ b/tools/perf/pmu-events/intel_metrics.py
> > @@ -1,9 +1,10 @@
> > #!/usr/bin/env python3
> > # SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
> > -from metric import (JsonEncodeMetric, JsonEncodeMetricGroupDescriptions, LoadEvents,
> > - MetricGroup)
> > +from metric import (d_ratio, has_event, Event, JsonEncodeMetric, JsonEncodeMetricGroupDescriptions,
> > + LoadEvents, Metric, MetricGroup, Select)
> > import argparse
> > import json
> > +import math
> > import os
> >
> > parser = argparse.ArgumentParser(description="Intel perf json generator")
> > @@ -14,7 +15,45 @@ args = parser.parse_args()
> > directory = f"{os.path.dirname(os.path.realpath(__file__))}/arch/x86/{args.model}/"
> > LoadEvents(directory)
> >
> > -all_metrics = MetricGroup("",[])
> > +interval_sec = Event("duration_time")
> > +
> > +def Rapl() -> MetricGroup:
> > + """Processor socket power consumption estimate.
> > +
> > + Use events from the running average power limit (RAPL) driver.
> > + """
> > + # Watts = joules/second
> > + pkg = Event("power/energy\-pkg/")
> > + cond_pkg = Select(pkg, has_event(pkg), math.nan)
> > + cores = Event("power/energy\-cores/")
> > + cond_cores = Select(cores, has_event(cores), math.nan)
> > + ram = Event("power/energy\-ram/")
> > + cond_ram = Select(ram, has_event(ram), math.nan)
> > + gpu = Event("power/energy\-gpu/")
> > + cond_gpu = Select(gpu, has_event(gpu), math.nan)
> > + psys = Event("power/energy\-psys/")
> > + cond_psys = Select(psys, has_event(psys), math.nan)
> > + scale = 2.3283064365386962890625e-10
> > + metrics = [
> > + Metric("cpu_power_pkg", "",
> > + d_ratio(cond_pkg * scale, interval_sec), "Watts"),
> > + Metric("cpu_power_cores", "",
> > + d_ratio(cond_cores * scale, interval_sec), "Watts"),
> > + Metric("cpu_power_ram", "",
> > + d_ratio(cond_ram * scale, interval_sec), "Watts"),
> > + Metric("cpu_power_gpu", "",
> > + d_ratio(cond_gpu * scale, interval_sec), "Watts"),
> > + Metric("cpu_power_psys", "",
> > + d_ratio(cond_psys * scale, interval_sec), "Watts"),
> > + ]
> > +
> > + return MetricGroup("cpu_power", metrics,
> > + description="Processor socket power consumption estimates")
>
> As far as I know, the RAPL counters monitor energy consumption
> across different domains. The scope may not always be a socket. I think
> the description may cause confusion.
> Maybe we could just call it "RAPL power consumption estimates", or "Running
> Average Power Limit (RAPL) power consumption estimates".

Ack. Will fix in v2.
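
For reference, here is a minimal plain-Python sketch of the arithmetic
the metrics above encode. It is only an illustration, not the
generator code: the names RAPL_SCALE and power_watts are hypothetical,
the scale constant in the patch is exactly 2^-32, and returning 0 for
a zero interval is an assumption about how the division is guarded.

```python
import math

# The scale literal in the patch is exactly 2**-32; multiplying a raw
# energy count by it gives joules.
RAPL_SCALE = 2.0 ** -32


def power_watts(raw_energy: float, interval_sec: float,
                present: bool = True) -> float:
    """Hypothetical helper: average power in Watts (joules/second).

    raw_energy   -- raw count from an energy event over the interval
    interval_sec -- length of the measurement interval in seconds
    present      -- False models an event the machine does not expose
    """
    if not present:
        # Mirrors the Select(..., has_event(...), math.nan) fallback.
        return math.nan
    if interval_sec == 0:
        # Assumption: a zero interval yields 0 rather than an error.
        return 0.0
    return raw_energy * RAPL_SCALE / interval_sec


if __name__ == "__main__":
    # A raw count of 10 * 2**32 is 10 joules; over 2 seconds that is
    # an average of 5 Watts.
    print(power_watts(10 * 2 ** 32, 2.0))
    # A missing event gives NaN instead of a misleading 0.
    print(power_watts(0, 2.0, present=False))
```

The NaN fallback keeps a domain that is missing on a given model from
reading as a real 0 Watts.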

Thanks,
Ian

> Thanks,
> Kan
> > +
> > +
> > +all_metrics = MetricGroup("", [
> > + Rapl(),
> > +])
> >
> > if args.metricgroups:
> > print(JsonEncodeMetricGroupDescriptions(all_metrics))