Re: [PATCH v2 15/39] x86/ibt,kprobes: Fix more +0 assumptions
From: Naveen N. Rao
Date: Thu Mar 03 2022 - 07:12:23 EST
Peter Zijlstra wrote:
On Wed, Mar 02, 2022 at 08:32:45PM +0100, Peter Zijlstra wrote:
I wonder if you also want to tighten up on_func_entry? Wouldn't the
above suggest something like:
Good question ;)
I noticed this yesterday, but held off on making changes so that I can
think this through.
kprobe_opcode_t *arch_adjust_kprobe_addr(unsigned long addr, unsigned long offset,
bool *on_func_entry)
{
#ifdef PPC64_ELF_ABI_V2
unsigned long entry = ppc_function_entry((void *)addr) - addr;
*on_func_entry = !offset || offset == entry;
if (*on_func_entry)
offset = entry;
#else
*on_func_entry = !offset;
#endif
return (void *)(addr + offset);
}
This is more accurate and probably something we should do in the long
term. The main issue I see today is userspace, specifically perf (and
perhaps others too).
Historically, we have worked around the issue of probes on a function's
global entry point not working by having userspace adjust the offset at
which probes are placed. This works well if those object files have
either the symbol table, or debuginfo capturing if functions have a
separate local entry point. In the absence of those, we are left
guessing and we chose to just offset all probes at function entry by 8
(GEP almost always has the same two instructions) so that perf "just
works". This still works well for functions without a GEP since we
expect to see the two ftrace instructions at function entry, so we are
ok to probe after that. As an added bonus, this also allows uprobes to
work, for the most part.
On the kernel side, we only implemented logic to adjust probe address if
a function name was specified without an offset. This went for a toss
once perf probe moved to using _text as the base symbol for kprobes
though, and we weren't handling scenarios where addr+offset was
provided. With the changes in this series, we can now adjust kprobe
address across all those scenarios properly.
If we update perf to not pass an offset any more, then newer perf will
stop working on older kernels. If we make the logic to determine
function entry strict in the kernel, then we risk breaking existing
userspace.
I'm not sure how best to address this.
One question though; the above seems to work for +0 or +8 (IIRC your
instructions are 4 bytes each and the GEP is 2 instructions).
But what do we want to happen for +4 ?
We don't want to change the behavior of probes at the second instruction
in GEP. The thinking is that it allows the rare scenario (if at all) of
wanting to catch indirect function calls, and/or cross-module function
calls -- especially since we now promote probes at GEP to LEP. I frankly
know of no such scenarios so far, but in any case, if the user is
specifying an offset, they better know what they are asking for :)
For the same reason, we should allow kretprobe at +4.
Thanks,
Naveen