Re: [PATCH v8 00/12] Introduce CAP_PERFMON to secure system performance monitoring and observability

From: Arnaldo Carvalho de Melo
Date: Tue Apr 07 2020 - 12:37:01 EST


On Tue, Apr 07, 2020 at 05:54:27PM +0300, Alexey Budankov wrote:
> On 07.04.2020 17:35, Arnaldo Carvalho de Melo wrote:
> > On Tue, Apr 07, 2020 at 11:30:14AM -0300, Arnaldo Carvalho de Melo wrote:
> >> [perf@five ~]$ type perf
> >> perf is hashed (/home/perf/bin/perf)
> >> [perf@five ~]$ getcap /home/perf/bin/perf
> >> /home/perf/bin/perf = cap_sys_ptrace,cap_syslog,38+ep
> >> [perf@five ~]$ groups
> >> perf perf_users
> >> [perf@five ~]$ id
> >> uid=1002(perf) gid=1002(perf) groups=1002(perf),1003(perf_users) context=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
> >> [perf@five ~]$ perf top --stdio
> >> Error:
> >> Failed to mmap with 1 (Operation not permitted)
> >> [perf@five ~]$ perf record -a
> >> ^C[ perf record: Woken up 1 times to write data ]
> >> [ perf record: Captured and wrote 1.177 MB perf.data (1552 samples) ]
> >>
> >> [perf@five ~]$ perf evlist
> >> cycles:u
> >> [perf@five ~]$
> >
> > Humm, perf record falls back to cycles:u after initially trying cycles
> > (i.e. kernel and userspace), so lemme try 'perf top -e cycles:u' as
> > well. Humm, not really:
> >
> > [perf@five ~]$ perf top --stdio -e cycles:u
> > Error:
> > Failed to mmap with 1 (Operation not permitted)
> > [perf@five ~]$ perf record -e cycles:u -a sleep 1
> > [ perf record: Woken up 1 times to write data ]
> > [ perf record: Captured and wrote 1.123 MB perf.data (132 samples) ]
> > [perf@five ~]$
> >
> > Back to debugging this.
>
> It could make sense to add cap_ipc_lock to the binary to rule out this check:
>
> kernel/events/core.c: 6101
> 	if ((locked > lock_limit) && perf_is_paranoid() &&
> 		!capable(CAP_IPC_LOCK)) {
> 		ret = -EPERM;
> 		goto unlock;
> 	}
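
For reference, the check quoted above can be modeled in user space. This
is only a sketch of the decision logic, not kernel code: the function
names and all numbers used with it are hypothetical, and it assumes
perf_is_paranoid() means "perf_event_paranoid is greater than -1". It
shows why the mmap fails with 1 (EPERM) when the ring buffer exceeds the
locked-memory limit and the binary lacks cap_ipc_lock.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical user-space stand-in for perf_is_paranoid().
 * Assumption: the kernel helper is true when the perf_event_paranoid
 * sysctl is greater than -1.
 */
bool perf_is_paranoid_model(int perf_event_paranoid)
{
	return perf_event_paranoid > -1;
}

/*
 * Model of the quoted check: refuse the mmap with -EPERM only when all
 * three conditions hold. 'locked' and 'lock_limit' are in the same
 * (arbitrary) unit; the caller supplies them, nothing is read from the
 * running system.
 */
int perf_mmap_lock_check(unsigned long locked, unsigned long lock_limit,
			 int perf_event_paranoid, bool has_cap_ipc_lock)
{
	if (locked > lock_limit &&
	    perf_is_paranoid_model(perf_event_paranoid) &&
	    !has_cap_ipc_lock)
		return -EPERM;

	return 0;
}

/* Illustrative cases with made-up numbers. */
void perf_mmap_lock_check_examples(void)
{
	/* Over the limit, paranoid, no CAP_IPC_LOCK: rejected. */
	assert(perf_mmap_lock_check(600, 516, 2, false) == -EPERM);
	/* Same request once the binary has cap_ipc_lock: allowed. */
	assert(perf_mmap_lock_check(600, 516, 2, true) == 0);
	/* Under the limit: the capability is never consulted. */
	assert(perf_mmap_lock_check(100, 516, 2, false) == 0);
}
```

Calling perf_mmap_lock_check_examples() from a small main() exercises
the three cases. The second one corresponds to adding cap_ipc_lock to
the perf binary.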


That did the trick. I'll update the documentation and include this in my
"Committer testing" section:

[perf@five ~]$ groups
perf perf_users
[perf@five ~]$ ls -lahF bin/perf
-rwxr-x---. 1 root perf_users 24M Apr 7 10:34 bin/perf*
[perf@five ~]$ getcap bin/perf
bin/perf = cap_ipc_lock,cap_sys_ptrace,cap_syslog,38+ep
[perf@five ~]$
[perf@five ~]$ perf top --stdio


PerfTop: 652 irqs/sec kernel:73.8% exact: 99.7% lost: 0/0 drop: 0/0 [4000Hz cycles:u], (all, 12 CPUs)
---------------------------------------------------------------------------------------------------------------

13.03% [kernel] [k] module_get_kallsym
5.25% [kernel] [k] kallsyms_expand_symbol.constprop.0
5.00% libc-2.30.so [.] __GI_____strtoull_l_internal
4.41% [kernel] [k] memcpy
3.42% [kernel] [k] vsnprintf
2.98% perf [.] map__process_kallsym_symbol
2.86% [kernel] [k] format_decode
2.73% [kernel] [k] number
2.70% perf [.] rb_next
2.59% perf [.] maps__split_kallsyms
2.54% [kernel] [k] string_nocheck
1.90% libc-2.30.so [.] _IO_getdelim
1.86% [kernel] [k] __x86_indirect_thunk_rax
1.53% libc-2.30.so [.] _int_malloc
1.48% libc-2.30.so [.] __memmove_avx_unaligned_erms
1.40% [kernel] [k] clear_page_rep
1.07% perf [.] rb_insert_color
1.01% libc-2.30.so [.] _IO_feof
0.99% perf [.] __dso__load_kallsyms
0.98% [kernel] [k] s_next
0.96% perf [.] __rblist__findnew
0.95% [kernel] [k] strlen
0.95% perf [.] arch__symbols__fixup_end
0.94% libpixman-1.so.0.38.4 [.] 0x000000000006f4af
0.94% perf [.] symbol__new
0.89% libpixman-1.so.0.38.4 [.] 0x000000000006f4a0
0.86% [kernel] [k] seq_read
0.81% libpixman-1.so.0.38.4 [.] 0x000000000006f4ab
0.80% perf [.] __symbols__insert
0.73% libpixman-1.so.0.38.4 [.] 0x000000000006f4a7
0.67% [kernel] [k] s_show
0.66% libc-2.30.so [.] __libc_calloc
0.61% libpixman-1.so.0.38.4 [.] 0x000000000006f4bb
0.59% [kernel] [k] get_page_from_freelist
0.59% perf [.] memcpy@plt
0.58% perf [.] eprintf
exiting.
[perf@five ~]$

There is still something strange here: the event is cycles:u (see the
PerfTop line), but it is getting kernel samples :-\

- Arnaldo