Re: [PATCH v10 perf, bpf-next 1/9] perf, bpf: Introduce PERF_RECORD_KSYMBOL

From: Song Liu
Date: Thu Jan 17 2019 - 10:04:09 EST




> On Jan 17, 2019, at 6:58 AM, Arnaldo Carvalho de Melo <acme@xxxxxxxxxx> wrote:
>
> Em Thu, Jan 17, 2019 at 02:49:10PM +0000, Song Liu escreveu:
>>
>>
>>> On Jan 17, 2019, at 4:56 AM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>>>
>>> On Wed, Jan 16, 2019 at 08:29:23AM -0800, Song Liu wrote:
>>>> For better performance analysis of dynamically JITed and loaded kernel
>>>> functions, such as BPF programs, this patch introduces
>>>> PERF_RECORD_KSYMBOL, a new perf_event_type that exposes kernel symbol
>>>> register/unregister information to user space.
>>>>
>>>> The following data structure is used for PERF_RECORD_KSYMBOL.
>>>>
>>>> /*
>>>> * struct {
>>>> * struct perf_event_header header;
>>>> * u64 addr;
>>>> * u32 len;
>>>> * u16 ksym_type;
>>>> * u16 flags;
>>>> * char name[];
>>>> * struct sample_id sample_id;
>>>> * };
>>>> */
>>>
>>> So I've cobbled together the attached patches to see how it would work
>>> out..
>>>
>>> I didn't convert ftrace trampolines; because ftrace code has this
>>> uncanny ability to make my head hurt. But I don't think it should be
>>> hard, once one figures out the right structure to stick that
>>> kallsym_node thing in (ftrace_ops ?!).
>>>
>>> It is compiled only, so no testing what so ever (also, no changelogs).
>>>
>>> I didn't wire up the KSYM_TYPE thing; I'm wondering if we really need
>>> that, OTOH it really doesn't hurt having it either.
>>>
>>> One weird thing I noticed, wth does bpf_prog_kallsyms_add() check
>>> CAP_ADMIN ?! Surely even a non-priv JIT'ed program generates symbols,
>>> why hide those?
>>>
>>> Anyway; with the one nit about the get_names() thing sorted:
>>>
>>> Acked-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
>>>
>>> (thanks for sticking with this)
>>> <peterz-latch-next.patch><peterz-kallsym.patch><peterz-kallsym-bpf.patch>
>>
>> Aha, now I get the point on perf_event_ksymbol(). Yeah this approach is
>> definitely better.
>>
>> While I run more tests with these patches, could we get current in
>> perf/core? This will enable the development of user space tools like
>> bcc.
>>
>> Also, I current base this set on bpf-next tree, as tip/perf/core is
>> 4 week old. Shall I rebase the set on Linus' tree? Or shall I wait for
>> tip/perf/core?
>
> So, can you post one last set, this time with PeterZ's Acked-by,
> assuming you're sorting out the get_names() thing, and I can try merging
> this into my perf/core branch, then pushing it out to Ingo, my perf/core
> starts from tip/perf/urgent, so should be new enough.
>
> I'd then right after testing it send a pull request to Ingo, synching
> everything.
>
> - Arnaldo

Thanks Arnaldo! I will send it soon.

Song