Re: [PATCH v2 15/39] x86/ibt,kprobes: Fix more +0 assumptions

From: Masami Hiramatsu
Date: Fri Feb 25 2022 - 21:10:51 EST


On Fri, 25 Feb 2022 16:41:15 +0100
Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:

> On Fri, Feb 25, 2022 at 10:42:49PM +0900, Masami Hiramatsu wrote:
>
> > OK, this sounds like kp->addr should be "call fentry" if there is ENDBR.
> >
> > >
> > > This patch takes the approach that sym+0 means __fentry__, irrespective
> > > of where it might actually live. I *think* that's more or less
> > > consistent with what other architectures do; specifically see
> > > arch/powerpc/kernel/kprobes.c:kprobe_lookup_name(). I'm not quite sure
> > > what ARM64 does when it has BTI on (which is then very similar to what
> > > we have here).
> >
> > Yeah, I know that powerpc does such a thing, but I think that is not what
> > users expect. I actually would like to fix that, because on powerpc
> > and in other non-x86 cases (without BTI/IBT), the instruction at sym+0 is
> > actually executed.
> >
> > >
> > > What do you think makes most sense here?
> >
> > Is there any way to distinguish the "preparing instructions" (part of
> > calling mcount) from this kind of trap instruction online[1]? If possible,
> > I would like to skip such traps, but put the probe on the preparing
> > instructions.
>
> None that exist, but we could easily create one. See also my email here:
>
> https://lkml.kernel.org/r/Yhj1oFcTl2RnghBz@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>
> That skip_endbr() function is basically what you're looking for; it just
> needs a better name and a Power/ARM64 implementation to get what you
> want, right?

Great! That's what I need. I think is_endbr() is also useful :)
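
To make the idea concrete, here is a minimal user-space sketch of what such
helpers might look like. It is not the code from the linked mail: the names
is_endbr() and skip_endbr() come from this thread, but the signatures and
the byte-buffer interface are assumptions made for illustration, and only
the plain ENDBR64 encoding (f3 0f 1e fa) is checked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* ENDBR64 is encoded as the bytes f3 0f 1e fa; read as a little-endian
 * 32-bit value that is 0xfa1e0ff3. */
#define ENDBR64_INSN 0xfa1e0ff3u

/* Hypothetical signature: returns true if the 4-byte value is ENDBR64. */
static bool is_endbr(uint32_t insn)
{
	return insn == ENDBR64_INSN;
}

/* Hypothetical signature: given a pointer to at least 4 readable bytes at
 * the start of a function, return the address of the first instruction
 * after an optional ENDBR64. */
static const uint8_t *skip_endbr(const uint8_t *addr)
{
	uint32_t insn;

	assert(addr != NULL);

	/* Assemble the value byte by byte so the check does not depend on
	 * host endianness or alignment. */
	insn = (uint32_t)addr[0] |
	       ((uint32_t)addr[1] << 8) |
	       ((uint32_t)addr[2] << 16) |
	       ((uint32_t)addr[3] << 24);

	return is_endbr(insn) ? addr + 4 : addr;
}
```

With something like this, a probe requested at sym+0 could be moved to
skip_endbr(sym) when the function starts with ENDBR, and left alone
otherwise.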

> The alternative 'hack' I've been contemplating is (ab)using
> INT_MIN/INT_MAX offset for __fentry__ and __fexit__ points (the latter
> is something we'll probably have to grow when CET-SHSTK or backward-edge
> CFI gets to be done, because then ROP tricks as used by function-graph
> and kretprobes are out the window).
>
> That way sym+[0..size) is still a valid reference to the actual
> instruction in the symbol, but sym+INT_MIN will hard map to __fentry__
> while sym+INT_MAX will get us __fexit__.

Interesting, is that done by another series?
Maybe I have to check that change for kprobe jump optimization.
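
Just to check my understanding of the INT_MIN/INT_MAX idea, a rough sketch
of the offset resolution follows. Everything in it is hypothetical: struct
sym_info, its fields, and resolve_sym_offset() are names invented for this
example and do not exist in the kernel, and it assumes the symbol size is
always smaller than INT_MAX so the sentinel values cannot collide with a
real offset.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-symbol description, for illustration only. */
struct sym_info {
	unsigned long start;      /* address of sym+0 */
	unsigned long size;       /* symbol size in bytes, assumed < INT_MAX */
	unsigned long fentry_off; /* offset of the __fentry__ site */
	unsigned long fexit_off;  /* offset of a (future) __fexit__ site */
};

/* Resolve sym+offset to an address.
 * INT_MIN maps to the __fentry__ site and INT_MAX to the __fexit__ site;
 * an offset in [0, size) is a literal reference into the symbol body.
 * Returns false for any other offset. */
static bool resolve_sym_offset(const struct sym_info *sym, int offset,
			       unsigned long *addr)
{
	assert(sym != NULL && addr != NULL);

	if (offset == INT_MIN) {
		*addr = sym->start + sym->fentry_off;
		return true;
	}
	if (offset == INT_MAX) {
		*addr = sym->start + sym->fexit_off;
		return true;
	}
	if (offset >= 0 && (unsigned long)offset < sym->size) {
		*addr = sym->start + (unsigned long)offset;
		return true;
	}
	return false;
}
```

In this reading, sym+0 stays the real first instruction (ENDBR, if present),
and only the two sentinel offsets are redirected.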

Thank you,

--
Masami Hiramatsu <mhiramat@xxxxxxxxxx>