Re: [RFC PATCH] ftrace: Skip __fentry__ location of overridden weak functions

From: Peter Zijlstra
Date: Fri Jun 07 2024 - 11:05:51 EST


On Fri, Jun 07, 2024 at 07:52:11PM +0800, Zheng Yejian wrote:
> ftrace_location() was changed to not only return the __fentry__ location
> when called for the __fentry__ location, but also when called for the
> sym+0 location after commit aebfd12521d9 ("x86/ibt,ftrace: Search for
> __fentry__ location"). That is, if sym+0 location is not __fentry__,
> ftrace_location() would find one over the entire size of the sym.
>
> However, there is a case where more than one __fentry__ exists in the sym
> range (described below), and ftrace_location() would then find the wrong
> __fentry__ location by binary search, which would cause its users such as
> livepatch/kprobe/bpf to not work properly on this sym!
>
> The case is that, based on current compiler behavior, suppose:
> - function A is followed by weak function B1 in same binary file;
> - weak function B1 is overridden by function B2;
> Then in the final binary file:
> - symbol B1 will be removed from symbol table while its instructions are
> not removed;
> - __fentry__ of B1 will be still in __mcount_loc table;
> - function size of A is computed by subtracting the symbol address of
> A from its next symbol address (see kallsyms_lookup_size_offset()),
> but because symbol info of B1 is removed, the next symbol of A is
> originally the next symbol of B1. See the following example, where the
> size of function A will be (symbol_address_C - symbol_address_A):
>
> symbol_address_A
> symbol_address_B1 (Not in symbol table)
> symbol_address_C
>
> The weak function issue has been discovered in commit b39181f7c690
> ("ftrace: Add FTRACE_MCOUNT_MAX_OFFSET to avoid adding weak function")
> but it didn't resolve the issue in ftrace_location().
>
> There may be the following resolutions:

Oh gawd, sodding weak functions again.

I would suggest changing scripts/kallsyms.c to emit readily identifiable
symbol names for all the weak junk, e.g.:

__weak_junk_NNNNN

That instantly fixes the immediate problem and Steve's horrid hack can
go away.
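
To make the size argument concrete, here is a toy userspace model, not
kernel code. Every name in it (toy_sym, toy_sym_size, toy_count_fentry,
toy_is_weak_junk) and every address is made up for illustration, and it
assumes __fentry__ sits at sym+0. It only shows that deriving a size from
"next symbol minus this symbol" pulls B1's __fentry__ into A's range, and
that a placeholder symbol at B1's address restores the boundary:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy model only: none of these names exist in the kernel. */
struct toy_sym {
	uintptr_t addr;
	const char *name;
};

/* Symbol table as emitted today: B1 (at 0x1100) has been dropped. */
const struct toy_sym toy_stripped[] = {
	{ 0x1000, "A" },
	{ 0x1200, "C" },
};

/* Same table with a placeholder emitted where B1's code still lives. */
const struct toy_sym toy_fixed[] = {
	{ 0x1000, "A" },
	{ 0x1100, "__weak_junk_00001" },
	{ 0x1200, "C" },
};

/* __mcount_loc-like table: the entry for B1 is still present. */
const uintptr_t toy_fentry[] = { 0x1000, 0x1100, 0x1200 };

/* Size of symbol i is the distance to the next symbol (or to text_end). */
size_t toy_sym_size(const struct toy_sym *tab, size_t n, size_t i,
		    uintptr_t text_end)
{
	assert(i < n);
	uintptr_t next = (i + 1 < n) ? tab[i + 1].addr : text_end;

	return next - tab[i].addr;
}

/* Count the fentry sites that land inside [start, start + size). */
size_t toy_count_fentry(const uintptr_t *locs, size_t nlocs,
			uintptr_t start, size_t size)
{
	size_t hits = 0;

	for (size_t i = 0; i < nlocs; i++) {
		if (locs[i] >= start && locs[i] - start < size)
			hits++;
	}
	return hits;
}

/* Recognise the placeholder names by prefix. */
int toy_is_weak_junk(const char *name)
{
	return strncmp(name, "__weak_junk_", strlen("__weak_junk_")) == 0;
}
```

With toy_stripped, A's computed size is 0x200 and two fentry sites fall
inside it; with toy_fixed, the size is 0x100 and only A's own site is left.
Anything that walks symbols can then skip the placeholder by name.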

Additionally, I would add a boot-up pass that would INT3 fill all such
functions and remove/invalidate every static_call/static_jump/fentry/
alternative entry that lies inside them.
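
Roughly the shape of such a pass, again as a toy userspace sketch under
loud assumptions: toy_poison_range, TOY_SITE_INVALID and the flat array of
patch-site offsets are all hypothetical, and real kernel text would have to
be written through the proper text-patching interfaces rather than memset().
The only hard fact used is that 0xCC is the x86 INT3 opcode:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_INT3		0xCC		/* x86 INT3 opcode */
#define TOY_SITE_INVALID	SIZE_MAX	/* marker for a dropped site */

/*
 * Fill [start, start + len) of a text buffer with INT3 and invalidate
 * every patch site (given as an offset into the buffer) that lies in
 * that range. Returns how many sites were invalidated.
 */
size_t toy_poison_range(uint8_t *text, size_t text_len,
			size_t start, size_t len,
			size_t *sites, size_t nsites)
{
	size_t dropped = 0;

	/* The range must sit entirely inside the buffer. */
	assert(start <= text_len && len <= text_len - start);

	memset(text + start, TOY_INT3, len);

	for (size_t i = 0; i < nsites; i++) {
		if (sites[i] != TOY_SITE_INVALID &&
		    sites[i] >= start && sites[i] - start < len) {
			sites[i] = TOY_SITE_INVALID;
			dropped++;
		}
	}
	return dropped;
}
```

The point of doing both steps together is that nothing should later try to
patch an instruction in a range that now contains only INT3.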