Re: [PATCH v5 perf, bpf-next 3/7] perf, bpf: introduce PERF_RECORD_BPF_EVENT

From: Song Liu
Date: Tue Jan 08 2019 - 18:38:22 EST

> On Jan 8, 2019, at 12:16 PM, Arnaldo Carvalho de Melo <acme@xxxxxxxxxx> wrote:
>
> Em Tue, Jan 08, 2019 at 07:10:20PM +0000, Song Liu escreveu:
>>> On Jan 8, 2019, at 10:41 AM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>>> On Thu, Dec 20, 2018 at 10:29:00AM -0800, Song Liu wrote:
>>>> @@ -986,9 +987,35 @@ enum perf_event_type {
>>>> */
>>>> PERF_RECORD_KSYMBOL = 17,
>>>>
>>>> + /*
>>>> + * Record bpf events:
>>>> + * enum perf_bpf_event_type {
>>>> + * PERF_BPF_EVENT_UNKNOWN = 0,
>>>> + * PERF_BPF_EVENT_PROG_LOAD = 1,
>>>> + * PERF_BPF_EVENT_PROG_UNLOAD = 2,
>>>> + * };
>>>> + *
>>>> + * struct {
>>>> + * struct perf_event_header header;
>>>> + * u16 type;
>>>> + * u16 flags;
>>>> + * u32 id;
>>>> + * u8 tag[BPF_TAG_SIZE];
>>>> + * struct sample_id sample_id;
>>>> + * };
>>>> + */
>>>> + PERF_RECORD_BPF_EVENT = 18,
>
>>> It was suggested to allow pinning modules/programs to avoid this
>>> situation, but that of course has other undesirable effects, such as a
>>> trivial DoS.
>>>
>>> A truly horrible hack would be to include an open filedesc in the event
>>> that needs closing to release the resource, but I'm sorry for even
>>> suggesting that **shudder**.
>>>
>>> Do we have any sane ideas?
>>
>> How about we gate the open filedesc solution with an option, and limit
>> that option for root only? If this still sounds hacky, maybe we should
>> just ignore when short-living programs are missed?
>
> Short lived short programs could go in the event? Short lived long
> events.. One could ask for max number of bytes of binary?
>
> The smallest kernel modules are 16KB, multiple of PAGE_SIZE:
>
> [acme@quaco perf]$ cat /proc/modules | sort -k2 -nr | tail
> ebtable_nat 16384 1 - Live 0x0000000000000000
> ebtable_filter 16384 1 - Live 0x0000000000000000
> crct10dif_pclmul 16384 0 - Live 0x0000000000000000
> crc32_pclmul 16384 0 - Live 0x0000000000000000
> coretemp 16384 0 - Live 0x0000000000000000
> btrtl 16384 1 btusb, Live 0x0000000000000000
> btbcm 16384 1 btusb, Live 0x0000000000000000
> arc4 16384 2 - Live 0x0000000000000000
> acpi_thermal_rel 16384 1 int3400_thermal, Live 0x0000000000000000
> ac97_bus 16384 1 snd_soc_core, Live 0x0000000000000000
> [acme@quaco perf]$
>
> On a Fedora 29 I have these here, all rather small:
>
> # bpftool prog
> 13: cgroup_skb tag 7be49e3934a125ba gpl
> loaded_at 2019-01-04T14:40:32-0300 uid 0
> xlated 296B jited 229B memlock 4096B map_ids 13,14
> 14: cgroup_skb tag 2a142ef67aaad174 gpl
> loaded_at 2019-01-04T14:40:32-0300 uid 0
> xlated 296B jited 229B memlock 4096B map_ids 13,14
> 15: cgroup_skb tag 7be49e3934a125ba gpl
> loaded_at 2019-01-04T14:40:32-0300 uid 0
> xlated 296B jited 229B memlock 4096B map_ids 15,16
> 16: cgroup_skb tag 2a142ef67aaad174 gpl
> loaded_at 2019-01-04T14:40:32-0300 uid 0
> xlated 296B jited 229B memlock 4096B map_ids 15,16
> 17: cgroup_skb tag 7be49e3934a125ba gpl
> loaded_at 2019-01-04T14:40:43-0300 uid 0
> xlated 296B jited 229B memlock 4096B map_ids 17,18
> 18: cgroup_skb tag 2a142ef67aaad174 gpl
> loaded_at 2019-01-04T14:40:43-0300 uid 0
> xlated 296B jited 229B memlock 4096B map_ids 17,18
> 21: cgroup_skb tag 7be49e3934a125ba gpl
> loaded_at 2019-01-04T14:40:43-0300 uid 0
> xlated 296B jited 229B memlock 4096B map_ids 21,22
> 22: cgroup_skb tag 2a142ef67aaad174 gpl
> loaded_at 2019-01-04T14:40:43-0300 uid 0
> xlated 296B jited 229B memlock 4096B map_ids 21,22
> [root@quaco IRPF2018]#
>
>
> Running 'perf trace' with its BPF augmenter adds these two more:
>
> 158: tracepoint name sys_enter tag 12504ba9402f952f gpl
> loaded_at 2019-01-08T17:12:39-0300 uid 0
> xlated 512B jited 374B memlock 4096B map_ids 118,117,116
> 159: tracepoint name sys_exit tag c1bd85c092d6e4aa gpl
> loaded_at 2019-01-08T17:12:39-0300 uid 0
> xlated 256B jited 191B memlock 4096B map_ids 118,117
> [root@quaco ~]#
>
> A PERF_RECORD_MMAP gets as its payload up to PATH_MAX - sizeof(u64).
>
> So for a class of programs, shoving it together with the
> PERF_RECORD_MMAP like event may be enough?
>
> You started the shuddering suggestions... ;-)
>
> - Arnaldo

Besides the jited binary, we are adding more information for each
BPF program, including source code. So even a short program could
easily exceed PATH_MAX...
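
To put rough numbers on this (a sketch only, not part of the patch): the
fixed fields of the proposed record take 24 bytes before sample_id, and
the PERF_RECORD_MMAP-style payload Arnaldo mentions is
PATH_MAX - sizeof(u64) = 4088 bytes. The names bpf_event_fixed,
inline_payload_budget and fits_inline below are made up for illustration,
and any source-text size passed in is an assumption, since how much extra
information gets attached per program is not settled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BPF_TAG_SIZE   8      /* same value as in include/uapi/linux/bpf.h */
#define PATH_MAX_BYTES 4096   /* Linux PATH_MAX; renamed to avoid clashing with <limits.h> */

/* Stand-in for struct perf_event_header (u32 type, u16 misc, u16 size). */
struct perf_event_header_sketch {
	uint32_t type;
	uint16_t misc;
	uint16_t size;
};

/* Fixed part of the proposed PERF_RECORD_BPF_EVENT, without sample_id. */
struct bpf_event_fixed {
	struct perf_event_header_sketch header;
	uint16_t type;            /* enum perf_bpf_event_type */
	uint16_t flags;
	uint32_t id;
	uint8_t  tag[BPF_TAG_SIZE];
};

static_assert(sizeof(struct bpf_event_fixed) == 24,
	      "8 (header) + 2 + 2 + 4 + 8 (tag), no padding expected");

/* Payload budget of a PERF_RECORD_MMAP-like event, per the quoted text. */
static size_t inline_payload_budget(void)
{
	return PATH_MAX_BYTES - sizeof(uint64_t);
}

/*
 * Would the jited image plus extra per-program data (e.g. source text)
 * fit inline? Returns 1 if so, 0 otherwise. Written to avoid overflow
 * in the addition.
 */
static int fits_inline(size_t jited_len, size_t extra_len)
{
	size_t budget = inline_payload_budget();

	if (extra_len > budget)
		return 0;
	return jited_len <= budget - extra_len;
}
```

With the sizes from the bpftool output above, fits_inline(374, 0) holds
for the sys_enter program's jited image alone, while the same program
with a hypothetical 4000 bytes of source text does not fit.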

Song