Re: [PATCH 00/18] [ANNOUNCE] Dynamically created function based events
From: Masami Hiramatsu
Date: Mon Feb 05 2018 - 09:39:25 EST
On Sun, 4 Feb 2018 09:21:30 -0800
Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> wrote:
> On Sun, Feb 04, 2018 at 12:57:47PM +0900, Masami Hiramatsu wrote:
> >
> > > I based some of the code from kprobes too. But I wanted this to be
> > > simpler, and as such, not as powerful as kprobes. More of a "poor man's"
> > > kprobe ;-) Where you are limited to functions and their arguments. If
> > > you need more power, switch to kprobes. In other words, it's just an
> > > added stepping stone.
> > >
> > > Also, this should work without kprobe support, only ftrace, and function
> > > args from the arch.
> >
> > Hmm, but the implementation seems very far from the current probe events,
> > so we need to consider how to unify them. Anyway, this is a very good time
> > to do it, because I found that the current probe-event fetch method does
> > not work well with retpoline/IBRS: it is full of indirect calls.
> >
> > I would like to convert it to eBPF if possible. It will be good for the
> > performance with JIT, and we can collaborate on the same code with BPF
> > people.
>
> The current probe fetch method is indeed going to slow down due to
> retpoline, but this issue is going to affect not only this piece
> of code, but the rest of the kernel where indirect call performance
> matters a lot, like the networking stack, where we have at least 4
> indirect calls per packet.
> So I'd suggest focusing on finding a general method instead of coming
> up with a specific solution for this kprobe fetching problem.
OK.
> The devirtualization approach works well and is applicable in many cases.
> For the networking stack, deliver_skb() and __netif_receive_skb_core()
> can check if (pt_prev->func == ip_rcv || pt_prev->func == ipv6_rcv)
> and call them directly.
Yeah, if the set of possible targets is limited, that works (like
replacing the indirect call with a switch-case).
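A minimal userspace sketch of that devirtualization idea, only to make the
shape concrete. Every name here (demo_deliver(), demo_ipv4_rcv(), struct
demo_packet_type, and so on) is a hypothetical stand-in, not kernel code:
the pointer is compared against the known hot targets, those are called
directly, and everything else falls back to the indirect call.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for a packet and its receive handlers. */
struct pkt { int len; };
typedef int (*rcv_fn)(struct pkt *);

static int demo_ipv4_rcv(struct pkt *p)  { return p->len + 4; }
static int demo_ipv6_rcv(struct pkt *p)  { return p->len + 6; }
static int demo_other_rcv(struct pkt *p) { return p->len; }

struct demo_packet_type { rcv_fn func; };

/*
 * Compare the pointer against the known hot targets and call those
 * directly; only unknown handlers go through the indirect call.
 */
static int demo_deliver(const struct demo_packet_type *pt, struct pkt *p)
{
	assert(pt != NULL && pt->func != NULL);

	if (pt->func == demo_ipv4_rcv)
		return demo_ipv4_rcv(p);	/* direct call */
	if (pt->func == demo_ipv6_rcv)
		return demo_ipv6_rcv(p);	/* direct call */
	return pt->func(p);			/* indirect call (fallback) */
}
```

This only pays off while the list of compared targets stays short; each
extra comparison adds a branch in front of the fallback path.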
> The other approach I was thinking to explore is a static_key-like
> mechanism for indirect calls. In many cases the target is rarely changed,
> so we can do an arch-specific rewrite of the destination offset inside
> a normal direct call instruction. That should be faster than retpoline.
I doubt it. Most indirect calls are of the form "ops->method", so the
target depends on which "ops" is passed to that call site.
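To illustrate what I mean, here is a small userspace sketch (all names are
hypothetical, made up for this example). There is a single "ops->method"
call site, but it reaches different functions depending on the ops it is
handed, so patching one fixed destination into that site would not cover
both callers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical ops table with a single method. */
struct demo_ops { int (*method)(int); };

static int demo_double(int x) { return 2 * x; }
static int demo_negate(int x) { return -x; }

static const struct demo_ops demo_ops_a = { .method = demo_double };
static const struct demo_ops demo_ops_b = { .method = demo_negate };

/* One call site; the target is only known once "ops" is known. */
static int demo_call(const struct demo_ops *ops, int x)
{
	assert(ops != NULL && ops->method != NULL);
	return ops->method(x);
}
```

Calling demo_call(&demo_ops_a, 21) and demo_call(&demo_ops_b, 21) goes
through the same instruction but lands in two different functions.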
> As for emitting raw bpf insns instead of kprobe fetch methods,
> there is a big problem with such an approach. The interpreter and all
> JITs take a 'struct bpf_prog' that passed the verifier and not just
> a random set of bpf instructions. BPF is not a generic assembler.
If you mean kernel/bpf/verifier.c, I'm happy with passing raw
bpf insns generated by kprobe-fetch-method to it :)
> BPF is an instruction set _with_ C calling convention.
> The registers and instructions must be used in a certain way or
> things will horribly break.
> See Documentation/bpf/bpf_design_QA.txt for details.
> Long ago I wrote a patch that converted pred tree walk into
> raw bpf insns. If that patch made it into mainline back then
> it would have been a huge headache for us now.
> So if you plan on generating bpf programs they _must_ pass the verifier.
Yes, of course.
Anyway, it is just an idea for retpoline/Spectre V2. (Yes, it is
a really big issue: it makes the previously faster pointer-call method
slower. Now a switch-case may be faster than that in some cases.)
I'm also considering simplifying it (or doing it with branches and static
function calls), as Steve did in this series.
Thank you,
--
Masami Hiramatsu <mhiramat@xxxxxxxxxx>