Re: [RFC PATCH -next v2 3/4] arm64/ftrace: support dynamically allocated trampolines

From: Masami Hiramatsu
Date: Wed May 04 2022 - 23:15:52 EST


Hi Mark,

On Wed, 4 May 2022 11:24:43 +0100
Mark Rutland <mark.rutland@xxxxxxx> wrote:

> On Tue, Apr 26, 2022 at 05:47:49PM +0900, Masami Hiramatsu wrote:
> > Hi Mark,
> >
> > On Fri, 22 Apr 2022 18:27:53 +0100
> > Mark Rutland <mark.rutland@xxxxxxx> wrote:
> >
> > > On Fri, Apr 22, 2022 at 11:45:41AM -0400, Steven Rostedt wrote:
> > > > On Fri, 22 Apr 2022 11:12:39 +0100
> > > > Mark Rutland <mark.rutland@xxxxxxx> wrote:
> > > >
> > > > > As an aside, I'd also love to remove the REGS/!REGs distinction, and always
> > > > > save a minimum amount of state (like ARGS, but never saving a full pt_regs),
> > > > > since on arm64 the extra state stored for the REGS case isn't useful (and we
> > > > > can't reliably capture all of the pt_regs state anyway, so bits of it are made
> > > > > up or not filled in).
> > > >
> > > > Note, the reason for the addition of REGS was a requirement of kprobes.
> > > > Because before ftrace, kprobes would be triggered at the start of a
> > > > function by a breakpoint that would load in all the regs. And for backward
> > > > compatibility, Masami wanted to make sure that kprobes coming from ftrace
> > > > had all the regs just like it had when coming from a breakpoint.
> >
> > Yes. Since this kprobes->ftrace conversion is done by kprobes transparently,
> > user doesn't know their kprobe handler is called from sw break or ftrace.
>
> The big problem is that on arm64 kprobes and ftrace are *very* different, and
> we can't do that transparently (unless both had a mode which just provided the
> ARGS).

OK, in that case, we can just drop the KPROBE_ON_FTRACE support on arm64.
A problem I thought in this case, is that the fprobe which I introduced recently
for eBPF is expecting to access pt_regs. I would like to know that what kind
of limitation exists if the ftrace on arm64 makes pt_regs. Only PSTATE is
a problem?

>
> > > > IIUC, kprobes is the only reason we have the "regs" variant (all other use
> > > > cases could get by with the ARGS version).
> > >
> > > I see. FWIW, we don't have KPROBES_ON_FTRACE on arm64.
> >
> > Right. Since x86 fentry puts the entry on function address, I need such
> > compatibility.
> >
> > But on arm64, ftrace leads some preparation instructions, kprobes can put
> > the sw break on the function address there. And may not need to put the
> > kprobes on it. So it depends on arch. I would like to keep the kprobes
> > available at the function address so that it can trace any registers.
> > (like debugger usage)
> >
> > > Also, the same problems apply to KRETPROBES: the synthetic `pstate`
> > > value is bogus and we don't fill in other bits of the regs (e.g. the PMR
> > > value), so it's not a "real" pt_regs, and things like
> > > interrupts_enabled(regs) won't work correctly.
> >
> > Would you mean the process which kprobes_save/restore_local_irqflag() does?
> > Is the regs->pstate saved correctly in sw break or ftrace? (sorry, I missed
> > the context)
>
> For `BRK` (SW break) instructions we take an exception, PSTATE (and all of the
> struct pt_regs) is saved correctly.

Great, thanks for the confirmation!

>
> For ftrace, PSTATE (and other bits of pt_regs) are not saved correctly.
> Practically speaking it's not feasible to do so reliably without taking an
> exception, which is why I'd like to reduce ftrace down to just the ARGs.

OK. But my interest is that the ftrace on arm64 can provide a limited
access to registers via pt_regs or not. I don't mind the contained values
so much because in the most case, 'users' will (most likely) access to the
ARGs via BPF or tracefs (and we can just warn users if they try to access
the registers which is not saved.) But if the arm64 ftrace only provides
a special data structure, arch-independent code must have 2 different access
code. That is inefficient. That is my concern.
IOW, I'm interested in interface abstraction.

> > > In addition, as
> > > KRETPROBES only hooks function entry/exit and x9-x17 + x19-x28 are
> > > meaningless at those times, no-one's going to care what they contain
> > > anyway.
> >
> > It depends on what bug they are trying to trace. C source level bug
> > will not need such information, but assembly level bug (or compiler
> > level bug) may need such registers. Anyway, this also depends on user.
> > I just won't like limit the usage.
>
> If that's how kretprobes are intended to be used, then I think they must
> *always* use a BRK as that's the only way to reliably get a complete struct
> pt_regs.
>
> I've implemented that:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git/commit/?h=arm64/kprobes/kretprobe-brk-trampoline&id=80be4ccbf47b0294a02b05b797cbff36364bc435
>
> ... and can send that out as a patch shortly.

Hm, this is reasonable, and not hard to port it for rethook on arm64.

Just FYI, the rethook will be used for kprobe and fprobe (ftrace based
probe, which was introduced for batch BPF probe [1]). kprobe will be
used for assembly level debug, but fprobe may not. Thus, eventually
those should have 2 different trampolines specified by rethook 'flags'.
But for the first step, I think we can use BRK trampoline by default.

[1] https://lore.kernel.org/all/20220316122419.933957-1-jolsa@xxxxxxxxxx/T/#u

> > > The state we can correctly snapshot (and that would be useful)
> > > is the same as ARGS.
> > >
> > > It'd be nice if KRETPROBES could just use ARGS, but a standard KPROBE
> > > that traps could provide regs (since it actually gets "real" regs, and
> > > within a function the other GPRs could be important).
> >
> > Here, the KRETPROBES means the exit handler, or including entry handler?
> > Since kretprobes uses a standard kprobe to trap the function entry.
>
> I'm suggesting that (if there are cases where kretprobes are only used to
> acquire arguments and return vaalues, and not other state), we change things so
> that kretprobes can use a different entry handler from a regular kprobe, and
> that new handler only has to acquire the arguments and return values, matching
> the ftrace ARGS.

I would like to move this feature on fprobe. Since the kretprobe is used
widely, it is hard to change the interface. Now fprobe is only used for
BPF, so I think there is a chance to change it.

So there are 2 options I think.

- Keep fprobe and kprobe current interface (use pt_regs) but only the registers
for argument and return values are accessible (hopefully, the interrupt state
etc. too) from fprobes. With this, the BPF can use same program code for
kprobes and fprobe, but BPF compiler will be able to reject user program
which access the non-saved registers.

- Introduce a new data structure which saves argument and return value for
fprobe. BPF runtime program and BPF compiler needs to be changed for this.

>
> That way we can have:
>
> a) A regular kprobe, which uses BRK as today, and gets a full pt_regs
>
> b) A regular kretprobe, which uses BRK for both entry/exit, and gets a full
> pt_regs in both cases.
>
> c) An args-only kretprobe, where the entry/exit handlers only present the ARGS
> to the registered handlers.
>
> If kretprobes always needs the full pt_regs, then (c) isn't necessary, and we
> don't need to change anything more than the kretprobes trampoline, as above.

I'm OK for the args-only, but that is not kretprobe, we need another one
and should remove current kretprobe in this case for avoiding confusion.
I think fprobe is a good candidate because it is not widely used yet.
Anyway, we need to talk with BPF people about this idea.

As a compromise plan, keep using pt_regs but filling a part of them on arm64.


> What I really want to get away from it kretprobes and ftrace having an
> incomplete pt_regs, and the two ways of doing that are:
>
> 1) Save/restore the full pt_regs by taking an exception.
>
> 2) Save/restore a minimal ARGS structure, and ensure the interfaces don't have
> to pass a struct pt_regs pointer.
>
> For kretprobes I can implement either, but for ftrace (2) is the only real
> option.

Hmm, I'm still not sure why you can't pass an incomplete pt_regs? (to optimize
stack usage??)

Anyway, I think your kretprobe patch is perfect. I think it is good for kretprobe.

Thank you,

>
> Thanks,
> Mark.
>
> > If you talk about fprobes (ftrace probe interface), it will only use the
> > ftrace. Thus your idea is acceptable for it (because fprobe is different
> > from kprobes *).
> >
> > * Of course we have to talk with BPF people so that they will only access
> > ARGS from BPF program on fprobes.
> >
> > Thank you,
> >
> > --
> > Masami Hiramatsu <mhiramat@xxxxxxxxxx>


--
Masami Hiramatsu <mhiramat@xxxxxxxxxx>