Re: [PATCH 3/5] perf util: kernel profiling is disallowed only when perf_event_paranoid > 1

From: Arnaldo Carvalho de Melo
Date: Tue Aug 27 2019 - 09:44:16 EST


On Mon, Aug 26, 2019 at 09:39:14PM -0400, Igor Lubashev wrote:
> Perf was too restrictive about sysctl kernel.perf_event_paranoid. The
> kernel only disallows profiling when perf_event_paranoid > 1. Make perf do
> the same.

Thanks for following up on this. I added these notes to this cset's
commit log message:

--------------------------------- 8< ------------------------------------

perf evsel: Kernel profiling is disallowed only when perf_event_paranoid > 1

Perf was too restrictive about sysctl kernel.perf_event_paranoid. The
kernel only disallows profiling when perf_event_paranoid > 1. Make perf
do the same.

Committer testing:

For a non-root user:

$ id
uid=1000(acme) gid=1000(acme) groups=1000(acme),10(wheel) context=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
$

Before:

We were restricting it to just userspace (:u suffix) even for a
workload started by the user:

$ perf record sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.001 MB perf.data (8 samples) ]
$ perf evlist
cycles:u
$ perf evlist -v
cycles:u: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|PERIOD, read_format: ID, disabled: 1, inherit: 1, exclude_kernel: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1
$ perf report --stdio
# To display the perf.data header info, please use --header/--header-only options.
#
# Total Lost Samples: 0
#
# Samples: 8 of event 'cycles:u'
# Event count (approx.): 1040396
#
# Overhead Command Shared Object Symbol
# ........ ....... ................ ......................
#
68.36% sleep libc-2.29.so [.] _dl_addr
27.33% sleep ld-2.29.so [.] dl_main
3.80% sleep ld-2.29.so [.] _dl_setup_hash
#
# (Tip: Order by the overhead of source file name and line number: perf report -s srcline)
#
$
$

After:

Kernel samples are collected as well, since the kernel allows kernel
profiling in that scenario:

$ perf record sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.023 MB perf.data (11 samples) ]
$ perf evlist
cycles
$ perf evlist -v
cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|PERIOD, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1
$
$ perf report --stdio
# To display the perf.data header info, please use --header/--header-only options.
#
# Total Lost Samples: 0
#
# Samples: 11 of event 'cycles'
# Event count (approx.): 1601964
#
# Overhead Command Shared Object Symbol
# ........ ....... ................ ..........................
#
28.14% sleep [kernel.vmlinux] [k] __rb_erase_color
27.21% sleep [kernel.vmlinux] [k] unmap_page_range
27.20% sleep ld-2.29.so [.] __tunable_get_val
15.24% sleep [kernel.vmlinux] [k] thp_get_unmapped_area
1.96% perf [kernel.vmlinux] [k] perf_event_exec
0.22% perf [kernel.vmlinux] [k] native_sched_clock
0.02% perf [kernel.vmlinux] [k] intel_bts_enable_local
0.00% perf [kernel.vmlinux] [k] native_write_msr
#
# (Tip: Boolean options have negative forms, e.g.: perf report --no-children)
#
$

Reported-by: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>
Signed-off-by: Igor Lubashev <ilubashe@xxxxxxxxxx>
Tested-by: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>
Cc: Alexander Shishkin <alexander.shishkin@xxxxxxxxxxxxxxx>
Cc: Alexey Budankov <alexey.budankov@xxxxxxxxxxxxxxx>
Cc: James Morris <jmorris@xxxxxxxxx>
Cc: Jiri Olsa <jolsa@xxxxxxxxxx>
Cc: Mathieu Poirier <mathieu.poirier@xxxxxxxxxx>
Cc: Namhyung Kim <namhyung@xxxxxxxxxx>
Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: Suzuki Poulose <suzuki.poulose@xxxxxxx>
Cc: linux-arm-kernel@xxxxxxxxxxxxxxxxxxx
Link: http://lkml.kernel.org/r/1566869956-7154-4-git-send-email-ilubashe@xxxxxxxxxx
Signed-off-by: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>

# Please enter the commit message for your changes. Lines starting
# with '#' will be ignored, and an empty message aborts the commit.
#
# Author: Igor Lubashev <ilubashe@xxxxxxxxxx>
# Date: Mon Aug 26 21:39:14 2019 -0400
#
# On branch perf/core
# Changes to be committed:
# modified: tools/perf/util/evsel.c
#
# Untracked files:
# a
# a.c
# bla
# f_mode
# perf.data
# perf.data.old
# q
#
# ------------------------ >8 ------------------------
# Do not modify or remove the line above.
# Everything below it will be ignored.
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 7c704b8f0e5c..d4540bfe4574 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -282,7 +282,7 @@ struct evsel *perf_evsel__new_idx(struct perf_event_attr *attr, int idx)

static bool perf_event_can_profile_kernel(void)
{
- return perf_event_paranoid_check(-1);
+ return perf_event_paranoid_check(1);
}

struct evsel *perf_evsel__new_cycles(bool precise)
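
For readers who want to see the effect of the one-line change outside of
perf, here is a minimal standalone sketch of the decision. It is not the
perf source: read_paranoid_level() and paranoid_allows() are hypothetical
names, and the logic assumes that perf_event_paranoid_check(max_level)
boils down to "privileged, or sysctl value <= max_level", which is not
shown in this patch.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical helper (not a perf function): read the
 * kernel.perf_event_paranoid sysctl. Returns INT_MAX when the file is
 * missing or unparsable, i.e. the most restrictive interpretation.
 */
int read_paranoid_level(void)
{
	int level = INT_MAX;
	FILE *fp = fopen("/proc/sys/kernel/perf_event_paranoid", "r");

	if (fp) {
		if (fscanf(fp, "%d", &level) != 1)
			level = INT_MAX;
		fclose(fp);
	}
	return level;
}

/*
 * Assumed shape of the check: a privileged user is always allowed,
 * anyone else only while the sysctl value does not exceed max_level.
 */
bool paranoid_allows(int level, bool privileged, int max_level)
{
	return privileged || level <= max_level;
}
```

With a sysctl value of 1 and an unprivileged user, paranoid_allows(1,
false, -1) is false (the old threshold, hence the cycles:u fallback),
while paranoid_allows(1, false, 1) is true (the new threshold). A caller
could pass read_paranoid_level() and geteuid() == 0 as the first two
arguments to evaluate the running system.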