Re: Have compiler remove __fentry locations from overwritten weak functions

From: Steven Rostedt
Date: Tue Oct 15 2024 - 10:01:02 EST


On Tue, 15 Oct 2024 10:01:49 +0200
Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:

> On Mon, Oct 14, 2024 at 09:08:41PM -0400, Steven Rostedt wrote:
> > There's a long standing issue with having fentry in weak functions that
> > were overwritten. This was first caught when a "notrace" function was
> > showing up in the /sys/kernel/tracing/available_filter_functions list.
> >
> > https://lore.kernel.org/all/20220412094923.0abe90955e5db486b7bca279@xxxxxxxxxx/
> >
> > What was happening is that ftrace uses kallsyms (what is basically listed
> > by nm)
>
> Ah, if only. NM lists the symbol table, and that has strictly more
> information than kallsyms, notably it includes symbol length. Using
> symbol length it is trivial to distinguish the case you're tripping
> over.
>
> > to map the location of __fentry to the previous symbol ahead of it.
> > Then it says that function is the name of the function for that __fentry.
> > But when you have weak functions, you can have this:
> >
> >
> > int notrace do_not_watch_me(void)
> > {
> > return whatever;
> > }
> >
> > void __weak define_me_in_arch(void)
> > {
> > return;
> > }
> >
> > The define_me_in_arch() gets a __fentry, but because it is overwritten, it
> > becomes attached to the end of do_not_watch_me(). When ftrace asks kallsyms
> > for the symbol name of that __fentry location, it returns
> > do_not_watch_me(), and ftrace will simply list that.
>
> Which is a kallsym bug that does not exists in the ELF space.
>
> There is a patch set that tries to fix this here:
>
> https://lore.kernel.org/all/20240723063258.2240610-1-zhengyejian@xxxxxxxxxxxxxxx/T/#u
>
> Instead of adding size to kallsyms it adds extra symbols to denote the
> holes, which ends up being similar.

Yeah, I was looking at this patch when I thought the compiler could do it
for us. Especially since it now creates the __mcount_locs too.

>
> > But I was thinking, since gcc (and perhaps clang?) now has:
> >
> > -mrecord-mcount
> >
> > that creates the __mcount_loc section that ftrace uses to find where the
> > __fentry's are. Perhaps the compiler can simply remove any of those
> > __fentry's that happen to exist in a weak function that was overridden.
> >
> > Would this be something that the compiler could do?
>
> Linker, this is link time. The linker would not only have to drop the
> (weak) symbol from the symbol table (which is all it really does), but
> it would now have to go and muck about with other sections.

Shucks, I was hoping that the linker would have enough info to know what
was inside the locations of those symbols. Oh well.

>
> It would have to go re-write the __mcount_loc section and drop entrie(s)
> that were inside the dropped symbol. But then, what about other extra
> sections? For instance, we can have (and still patch) alternatives in
> the 'dead' weak text.
>
> Where does this stop?
>
> LTO can do the right thing and completely eliminate the weak function,
> but anything short of that is a really big ask. Esp. since this is a
> problem of our own making -- kallsyms is a flawed representation of the
> symbol table.

Yeah, if it's too much of an ask, then forget it. But I decided to ask to
find out how much of an ask it is ;-)

-- Steve