Re: ftrace: Save module init functions kallsyms symbols for tracing

From: Jessica Yu
Date: Wed Oct 11 2017 - 07:19:53 EST


+++ Steven Rostedt [10/10/17 15:45 +0000]:
On Mon, 9 Oct 2017 15:51:00 -0400
Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:


> Using the snd_seq example from patch 03, I get the same human-readable trace:
>
> # tracer: function
> #
> # _-----=> irqs-off
> # / _----=> need-resched
> # | / _---=> hardirq/softirq
> # || / _--=> preempt-depth
> # ||| / delay
> # TASK-PID CPU# |||| TIMESTAMP FUNCTION
> # | | | |||| | |
> modprobe-12748 [005] .... 138.964923: alsa_seq_init <-do_one_initcall
> modprobe-12748 [005] .... 138.964924: client_init_data <-alsa_seq_init
> modprobe-12748 [005] .... 138.964924: snd_sequencer_memory_init <-alsa_seq_init
> modprobe-12748 [005] .... 138.964924: snd_seq_queues_init <-alsa_seq_init
> modprobe-12748 [005] .... 138.964924: snd_sequencer_device_init <-alsa_seq_init
> modprobe-12748 [005] .... 138.964956: snd_seq_info_init <-alsa_seq_init
> modprobe-12748 [005] .... 138.964956: create_info_entry <-snd_seq_info_init
> modprobe-12748 [005] .... 138.964958: create_info_entry <-snd_seq_info_init
> modprobe-12748 [005] .... 138.964959: create_info_entry <-snd_seq_info_init
> modprobe-12748 [005] .... 138.964960: snd_seq_system_client_init <-alsa_seq_init
> modprobe-12748 [005] .... 138.964960: snd_seq_create_kernel_client <-snd_seq_system_client_init
>
> In addition, the module init functions appear in /proc/kallsyms as
> well. Thoughts? Would saving init symbol information in the module
> symtab be sufficient for ftrace?

Hi Jessica,

Are you going to have time to work on this more?

Should we allow multiple init addresses to appear, or should there be
some check to see if they already exist, and remove the older one?

In /proc/kallsyms? IMO it would be better to make the behavior as
general as possible, as in, we just keep all symbols in each module's
symtab without regard to other modules. I think the code could quickly
get messy and complex if we're cross-checking other module symtabs
every time a module is loaded/unloaded to find which __init symbols
we should no longer keep.

Also worth noting, kallsyms already returns "live" symbols first
before __init ones in this patch, since module_address_lookup() is
called before ftrace_mod_address_lookup(). But I still need to look
into handling collisions in the case of multiple module init funcs
with the same addr, in this case I think we agree that kallsyms should
favor the last loaded module.

Ideally, I could have a way to know if a module init function was
traced and its address was stored in a ring buffer, and keep the init
mapping around and used for that buffer, and free it when the buffer
is cleared, and there's no other buffers that have a reference to it.
But I need to find a low overhead way to account for this.

Personally, I would rather have the mappings saved in the symbol tables
in the module code.

If we can come up with a better module way, I'll convert to that. But
as I want to get this code upstream by 4.15, I'm going to continue with
what I have.

Thoughts?

Hm, I am not sure how to fix the symbol mix-up issue when reading out the
trace. I will try to clarify the limitation with an example...

# rmmod snd_seq
# echo ':mod:snd_seq' > set_ftrace_filter
# echo function > current_tracer
# modprobe snd_seq
# cat trace
# tracer: function
#
# _-----=> irqs-off
# / _----=> need-resched
# | / _---=> hardirq/softirq
# || / _--=> preempt-depth
# ||| / delay
# TASK-PID CPU# |||| TIMESTAMP FUNCTION
# | | | |||| | |
modprobe-15680 [000] .... 882.128680: alsa_seq_init <-do_one_initcall
modprobe-15680 [000] .... 882.128683: client_init_data <-alsa_seq_init
modprobe-15680 [000] .... 882.128683: snd_sequencer_memory_init <-alsa_seq_init
modprobe-15680 [000] .... 882.128683: snd_seq_queues_init <-alsa_seq_init
modprobe-15680 [000] .... 882.128683: snd_sequencer_device_init <-alsa_seq_init
modprobe-15680 [000] .... 882.128765: snd_seq_info_init <-alsa_seq_init
modprobe-15680 [000] .... 882.128765: create_info_entry <-snd_seq_info_init
modprobe-15680 [000] .... 882.128770: create_info_entry <-snd_seq_info_init
modprobe-15680 [000] .... 882.128771: create_info_entry <-snd_seq_info_init
modprobe-15680 [000] .... 882.128772: snd_seq_system_client_init <-alsa_seq_init
modprobe-15680 [000] .... 882.128772: snd_seq_create_kernel_client <-snd_seq_system_client_init
modprobe-15680 [000] .... 882.128773: seq_create_client1 <-snd_seq_create_kernel_client
modprobe-15680 [000] .... 882.128773: snd_seq_pool_new <-seq_create_client1

[output snipped]

OK, so far so good. But as we load/unload modules, and as addresses get reused,
our trace output starts to get messed up:

# modprobe xxhash
# cat trace
# tracer: function
#
# _-----=> irqs-off
# / _----=> need-resched
# | / _---=> hardirq/softirq
# || / _--=> preempt-depth
# ||| / delay
# TASK-PID CPU# |||| TIMESTAMP FUNCTION
# | | | |||| | |
modprobe-15680 [000] .... 882.128680: xxh32 <-do_one_initcall
modprobe-15680 [000] .... 882.128683: xxh32 <-xxh32
modprobe-15680 [000] .... 882.128683: xxh32 <-xxh32
modprobe-15680 [000] .... 882.128683: xxh32 <-xxh32
modprobe-15680 [000] .... 882.128683: xxh32 <-xxh32
modprobe-15680 [000] .... 882.128765: xxh64 <-xxh32
modprobe-15680 [000] .... 882.128765: xxh64 <-xxh64
modprobe-15680 [000] .... 882.128770: xxh64 <-xxh64
modprobe-15680 [000] .... 882.128771: xxh64 <-xxh64
modprobe-15680 [000] .... 882.128772: xxh32 <-xxh32
modprobe-15680 [000] .... 882.128772: snd_seq_create_kernel_client <-xxh64
modprobe-15680 [000] .... 882.128773: seq_create_client1 <-snd_seq_create_kernel_client

% grep ffffffffc0d16000 /proc/kallsyms
ffffffffc0d16000 T xxh32 [xxhash]
ffffffffc0d16000 t alsa_seq_init [snd_seq]
ffffffffc0d16000 t init_module [snd_seq]
ffffffffc0d16000 t crypto_ccm_module_init [ccm]
ffffffffc0d16000 t init_module [ccm]

Since the address lookups in the trace output are being done by kallsyms,
kallsyms will return the "live" symbol xxh32 from the xxhash module (which is
in the module core memory, it's not an __init function) rather than the defunct
alsa_seq_init symbol in freed module init memory. This is because
module_address_lookup(), called from kallsyms_lookup, only takes core module
syms into account. I think this behavior makes sense, but perhaps it's not
completely suited for ftrace's use case here?

Maybe if there is some way to know which module(s) are being actively traced,
then it is possible to narrow the symbol search space to just those modules'
symbol table(s) when displaying the trace output. So if we know that functions
from snd_seq are being traced and *not* functions from xxhash, then we'd find
alsa_seq_init and not xxh32 for example. Do you think something like this would
be possible to do? In this case, ftrace might need something more specialized
than kallsyms_lookup()..
And yeah, I would rather eventually keep the module init function ->
address mapping in the module symbol tables. The information is
already there, we just don't make it available. __init syms from the
core kernel stay listed in /proc/kallsyms, so I think it's OK to do
this for modules too. And ftrace wouldn't need to do the extra
bookkeeping.

Unfortunately, my next two weeks are looking grim, so I don't think I
can invest too much time on this before the 4.15 merge window :-/ I
don't want to slow you down, so for now please continue with what you
have, but I can gladly revisit this issue with you after the merge
window.

Thanks!

Jessica