Re: [PATCH 2/2] perf tools: Improve IBS error handling

From: Kim Phillips
Date: Tue Nov 23 2021 - 10:26:10 EST


On 11/23/21 2:40 AM, kajoljain wrote:
On 10/8/21 12:47 AM, Kim Phillips wrote:
On 10/7/21 12:28 PM, Jiri Olsa wrote:
On Mon, Oct 04, 2021 at 04:41:14PM -0500, Kim Phillips wrote:
---
  tools/perf/util/evsel.c | 24 ++++++++++++++++++++++++
  1 file changed, 24 insertions(+)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index b915840690d4..f8a9cbd99314 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -2743,9 +2743,22 @@ static bool find_process(const char *name)
      return ret ? false : true;
  }
+static bool is_amd(const char *arch, const char *cpuid)
+{
+    return arch && !strcmp("x86", arch) && cpuid && strstarts(cpuid, "AuthenticAMD");
+}
+
+static bool is_amd_ibs(struct evsel *evsel)
+{
+    return evsel->core.attr.precise_ip || !strncmp(evsel->pmu_name, "ibs", 3);
+}
+
  int evsel__open_strerror(struct evsel *evsel, struct target *target,
               int err, char *msg, size_t size)
  {
+    struct perf_env *env = evsel__env(evsel);
+    const char *arch = perf_env__arch(env);
+    const char *cpuid = perf_env__cpuid(env);
      char sbuf[STRERR_BUFSIZE];
      int printed = 0, enforced = 0;
@@ -2841,6 +2854,17 @@ int evsel__open_strerror(struct evsel *evsel, struct target *target,
              return scnprintf(msg, size, "wrong clockid (%d).", clockid);
          if (perf_missing_features.aux_output)
              return scnprintf(msg, size, "The 'aux_output' feature is not supported, update the kernel.");
+        if (is_amd(arch, cpuid)) {
+            if (is_amd_ibs(evsel)) {

Would a single 'is_amd_ibs' call be better, one that checks for both AMD and IBS?

Good suggestion. If you look at the later patch in the
BRS series, I have rewritten it to add the new
AMD PMU like so:

 if (is_amd()) {
     if (is_amd_ibs()) {
         if (evsel->this)
             return
         if (evsel->that)
             return
     }
+    if (is_amd_brs()) {
+        if (evsel->this)
+            return
+        if (evsel->that)
+            return
+    }
 }

Hi Kim,
From my point of view, it isn't a good idea to add so many
checks to the common function definition itself.
Can you just add a check to see whether it's an AMD machine and then add a
function call that handles all four conditions together?

which is basically for:

+        if (is_amd(arch, cpuid)) {
+            if (is_amd_ibs(evsel)) {
+                if (evsel->core.attr.exclude_kernel)
+                    return scnprintf(msg, size,
+    "AMD IBS can't exclude kernel events. Try running at a higher privilege level.");
+                if (!evsel->core.system_wide)
+                    return scnprintf(msg, size,
+    "AMD IBS may only be available in system-wide/per-cpu mode. Try using -a, or -C and workload affinity");
+            }

and this:

+            if (is_amd_brs(evsel)) {
+                if (evsel->core.attr.freq)
+                    return scnprintf(msg, size,
+    "AMD Branch Sampling does not support frequency mode sampling, must pass a fixed sampling period via -c option or cpu/branch-brs,period=xxxx/.");
+                /* another reason is that the period is too small */
+                return scnprintf(msg, size,
+    "AMD Branch Sampling does not support sampling period smaller than what is reported in /sys/devices/cpu/caps/branches.");
+            }

IIRC, I tried something like that, but carrying the

struct target *target, int err, char *msg, size_t size

parameters made things worse.

So, in case we are on an AMD machine, the common function evsel__open_strerror
would call a function, maybe something like amd_evsel_open_strerror_check,
which would look for both the IBS and BRS conditions and return the
corresponding error message.
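
For illustration only, the helper suggested above might look roughly like
the sketch below. It is not code from this patch series. struct mock_evsel
is a hypothetical stand-in for the struct evsel fields the patch reads,
snprintf stands in for perf's scnprintf, the NULL check on pmu_name is an
added assumption, and "return 0 when nothing AMD-specific applies" is an
assumed convention. Only the IBS checks are shown; the BRS checks would
presumably follow the same pattern.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the struct evsel fields used by the checks. */
struct mock_evsel {
	const char *pmu_name;     /* evsel->pmu_name */
	unsigned int precise_ip;  /* evsel->core.attr.precise_ip */
	bool exclude_kernel;      /* evsel->core.attr.exclude_kernel */
	bool system_wide;         /* evsel->core.system_wide */
};

static bool is_amd(const char *arch, const char *cpuid)
{
	/* strncmp against the prefix plays the role of strstarts() here. */
	return arch && !strcmp("x86", arch) && cpuid &&
	       !strncmp(cpuid, "AuthenticAMD", strlen("AuthenticAMD"));
}

static bool is_amd_ibs(const struct mock_evsel *evsel)
{
	/* The NULL check on pmu_name is an assumption of this sketch. */
	return evsel->precise_ip ||
	       (evsel->pmu_name && !strncmp(evsel->pmu_name, "ibs", 3));
}

/*
 * Returns the length of the message written to msg, or 0 when no
 * AMD-specific explanation applies and the caller should fall through
 * to the common error handling.
 */
static int amd_evsel_open_strerror_check(const struct mock_evsel *evsel,
					 const char *arch, const char *cpuid,
					 char *msg, size_t size)
{
	if (!is_amd(arch, cpuid))
		return 0;

	if (is_amd_ibs(evsel)) {
		if (evsel->exclude_kernel)
			return snprintf(msg, size,
	"AMD IBS can't exclude kernel events. Try running at a higher privilege level.");
		if (!evsel->system_wide)
			return snprintf(msg, size,
	"AMD IBS may only be available in system-wide/per-cpu mode. Try using -a, or -C and workload affinity");
	}

	return 0;
}
```

In this sketch the common function would call the helper at the point
where the AMD checks currently sit and return its result only when it is
non-zero, so the helper still needs msg and size passed in.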

The vast majority of decisions made by evsel__open_strerror are
going to be common across most arch/uarches. AMD has only these
two pesky exceptions to the rule and therefore IMO it's ok
to have them inline with the common function, since the decisions
are so deeply intertwined. A new amd_evsel_open_strerror_check
sounds like it'd duplicate too much of the common function code
in order to handle the common error cases.

Kim