Re: [PATCH v3 0/2] kstats: kernel metric collector

From: Alexei Starovoitov
Date: Wed Feb 26 2020 - 11:31:38 EST


On Wed, Feb 26, 2020 at 7:03 AM Toke Høiland-Jørgensen <toke@xxxxxxxxxx> wrote:
>
> > The tracepoint/kprobe/kretprobe solution is much more expensive --
> > from my measurements, the hooks that invoke the various handlers take
> > ~250ns with hot cache, 1500+ns with cold cache, and tracing an empty
> > function this way reports 90ns with hot cache, 500ns with cold cache.
>
> I think it would be good if you could include an equivalent BPF-based
> implementation of your instrumentation example so people can (a) see the
> difference for themselves and get a better idea of how the approaches
> differ in a concrete case and (b) quantify the difference in performance
> between the two implementations.

+1

kprobe/kretprobe are expensive.
That is why we switched to bpf fentry/fexit, which is based on the bpf trampoline.
The overhead is close to zero. Currently it's used to collect stats for
bpf programs themselves, but the framework is there to collect these
stats for any kernel function. Please see:
https://lore.kernel.org/bpf/20200213210115.1455809-1-songliubraving@xxxxxx/T/#mae90f23e545f03bde837239e159909f4e4a1acaa
One of the ideas that came up during the discussion is to
teach 'perf stat' to do the same.
So the kernel already has all the facilities to instrument itself.
Only the user space work is left.
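
For illustration only, the entry/exit accounting that an fentry/fexit pair
would do can be sketched in plain user space C. This is not BPF code and not
the kstats API: func_stat, hook_entry, hook_exit, measure and mean_ns are
hypothetical names, and clock_gettime() stands in for whatever clock a real
probe would read. The sketch takes a timestamp at entry and accumulates the
delta at exit.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical per-function accumulator; not a kernel or kstats structure. */
struct func_stat {
	uint64_t calls;    /* completed entry/exit pairs */
	uint64_t total_ns; /* sum of measured latencies */
	uint64_t entry_ns; /* timestamp stored by the entry hook */
	bool in_call;      /* true between entry and exit */
};

/* Monotonic clock in nanoseconds (user space stand-in for a probe clock). */
static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Entry hook: remember when the traced function started. */
static void hook_entry(struct func_stat *s)
{
	s->entry_ns = now_ns();
	s->in_call = true;
}

/* Exit hook: add the elapsed time and count one completed call. */
static void hook_exit(struct func_stat *s)
{
	uint64_t end = now_ns();

	assert(s->in_call); /* exit must follow an entry */
	s->total_ns += end - s->entry_ns;
	s->calls++;
	s->in_call = false;
}

/* The "empty function" case: anything measured here is hook overhead. */
static void empty_fn(void)
{
}

/* Call fn n times with the hooks wrapped around each call. */
static void measure(struct func_stat *s, void (*fn)(void), unsigned int n)
{
	for (unsigned int i = 0; i < n; i++) {
		hook_entry(s);
		fn();
		hook_exit(s);
	}
}

/* Average nanoseconds per call, or 0 if nothing was recorded. */
static uint64_t mean_ns(const struct func_stat *s)
{
	return s->calls ? s->total_ns / s->calls : 0;
}
```

Calling measure(&s, empty_fn, 1000) on a zero-initialized struct func_stat and
then mean_ns(&s) gives the average cost of one timed call including the clock
reads, which is the kind of number quoted above for tracing an empty function.
The absolute values depend on the machine and say nothing about in-kernel
overhead.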