Re: ftrace module init race/bug

From: Steven Rostedt
Date: Mon Nov 07 2016 - 15:09:01 EST


On Mon, 7 Nov 2016 11:46:24 -0800
Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> wrote:

> Hi Steven,
>
> I'm debugging the following spew:
>
> [ 7139.111213] WARNING: CPU: 8 PID: 856649 at kernel/trace/ftrace.c:2747 ftrace_shutdown+0x263/0x270

What kernel version is this, as that line number doesn't line up to
nay warning on 4.9 on 4.9-rc1.


> [ 7139.753185] 0000000000000000 ffff881fc59eba40 ffffffff814717d1 0000000000000000
> [ 7139.817019] 0000000000000000 ffff881fc59eba80 ffffffff81076e43 00000abb8179b000
> [ 7139.880790] ffff881f7f3ee000 ffffffff81d1a820 0000000000000000 ffff881ff0929860
> [ 7139.945423] Call Trace:
> [ 7139.966365] [<ffffffff814717d1>] dump_stack+0x4d/0x6c
> [ 7140.010927] [<ffffffff81076e43>] __warn+0xd3/0xf0
> [ 7140.052228] [<ffffffff81076f2d>] warn_slowpath_null+0x1d/0x20
> [ 7140.102523] [<ffffffff811135c3>] ftrace_shutdown+0x263/0x270
> [ 7140.152065] [<ffffffff8111399e>] unregister_ftrace_function+0x1e/0x50
> [ 7140.208374] [<ffffffff81108cfb>] disarm_kprobe+0xdb/0x100
> [ 7140.256014] [<ffffffff81108da5>] __disable_kprobe+0x85/0x90
> [ 7140.305474] [<ffffffff81108dd1>] disable_kprobe+0x21/0x40
> [ 7140.352779] [<ffffffff8113264d>] disable_trace_kprobe+0xad/0xd0
> [ 7140.404624] [<ffffffff8113325d>] kprobe_register+0x2d/0xf0
> [ 7140.452625] [<ffffffff8112cf33>] perf_trace_event_unreg.isra.1+0x33/0x90
> [ 7140.511154] [<ffffffff8112d276>] perf_trace_destroy+0x36/0x50
> [ 7140.561450] [<ffffffff81145649>] tp_perf_event_destroy+0x9/0x10
> [ 7140.612590] [<ffffffff8114b777>] _free_event+0xb7/0x260
> [ 7140.658544] [<ffffffff8114bda4>] put_event+0x14/0x20
> [ 7140.702069] [<ffffffff8114bfb8>] perf_event_release_kernel+0x208/0x280
> [ 7140.759342] [<ffffffff8114c040>] perf_release+0x10/0x20
> [ 7140.805201] [<ffffffff811d62c6>] __fput+0xd6/0x210
> [ 7140.846497] [<ffffffff811d643e>] ____fput+0xe/0x10
> [ 7140.888797] [<ffffffff81091597>] task_work_run+0x77/0x90
> [ 7140.935328] [<ffffffff810799eb>] do_exit+0x2bb/0xb30
> [ 7140.979603] [<ffffffff8107a2df>] do_group_exit+0x3f/0xa0
> [ 7141.026167] [<ffffffff81084405>] get_signal+0x1b5/0x5d0
> [ 7141.071950] [<ffffffff8101b1e8>] do_signal+0x28/0x690
> [ 7141.116436] [<ffffffff81216f40>] ? ep_ptable_queue_proc+0xa0/0xa0
> [ 7141.170546] [<ffffffff812175e5>] ? ep_scan_ready_list.isra.12+0x185/0x1a0
> [ 7141.229848] [<ffffffff81677c25>] ? __sys_sendmsg+0x65/0x80
> [ 7141.277902] [<ffffffff81072442>] exit_to_usermode_loop+0x53/0x96
> [ 7141.330472] [<ffffffff8100294b>] do_syscall_64+0x8b/0xa0
> [ 7141.377005] [<ffffffff8179333c>] entry_SYSCALL64_slow_path+0x25/0x25
> [ 7141.432829] ---[ end trace 77eec51ed01e77ca ]---
> [ 7141.473508] fuse_dev_open+0x0/0x20 [fuse] flags:2000000
> [ 7141.520206] fuse_request_init+0x0/0xb0 [fuse] flags:2000000
> [ 7141.569308] fuse_req_init_context+0x0/0x60 [fuse] flags:2000000
> [ 7141.622772] queue_request+0x0/0x80 [fuse] flags:2000000
> [ 7141.670397] fuse_retrieve_end+0x0/0x30 [fuse] flags:2000000
> [ 7141.720671] fuse_dev_fasync+0x0/0x30 [fuse] flags:2000000
> [ 7141.769842] fuse_dev_ioctl+0x0/0x100 [fuse] flags:2000000
> [ 7141.820094] lock_request+0x0/0x50 [fuse] flags:2000000
> [ 7141.870199] unlock_request+0x0/0x50 [fuse] flags:2000000
> [ 7141.920406] fuse_dev_poll+0x0/0x90 [fuse] flags:2000000
> [ 7141.968565] flush_bg_queue+0x0/0xd0 [fuse] flags:2000000
> [ 7142.016531] queue_interrupt+0x0/0x80 [fuse] flags:2000000
> ...
> [ 7153.127914] fuse_get_root_inode+0x0/0x4c [fuse] flags:2000000
> [ 7153.179337] ------------[ cut here ]------------
> [ 7153.219335] WARNING: CPU: 8 PID: 856649 at kernel/kprobes.c:959 disarm_kprobe+0xbe/0x100
> [ 7153.289424] Failed to disarm kprobe-ftrace at ffffffff816f3ae0 (-19)
> [ 7153.344466] Modules linked in: fuse(+) nf_log_ipv6 ip6t_REJECT nf_reject_ipv6 nf_log_ipv4 nf_log_common xt_LOG ipt_REJECT nf_reject_ipv4 xt_limit xt_multiport ip6_tunnel tunnel6 ipip ip_tunnel tunnel4 coretemp ipmi_watchdog tcp_diag inet_diag ip6table_filter xt_NFLOG nfnetlink_log xt_comment xt_statistic iptable_filter xt_mark tpm_crb mptctl mptbase i2c_piix4 dm_crypt loop ipmi_devintf acpi_cpufreq iTCO_wdt iTCO_vendor_support ipmi_si ipmi_msghandler efivars i2c_i801 i2c_core sg lpc_ich mfd_core hpilo xhci_pci xhci_hcd mpt3sas scsi_transport_sas raid_class button
> [ 7153.794823] CPU: 8 PID: 856649 Comm: tcpeventd Tainted: G W 4.6.0-5_fbk1_223_gdbf0f40 #1
> [ 7153.874380] Hardware name: HP ProLiant DL380 Gen9, BIOS P89 03/10/2015
> [ 7153.930676] 0000000000000000 ffff881fc59eba38 ffffffff814717d1 ffff881fc59eba88
> [ 7153.994469] 0000000000000000 ffff881fc59eba78 ffffffff81076e43 000003bfdb2095c0
> [ 7154.058296] ffff883eecf78810 ffff883eecf78810 ffff883e945ed908 ffff883edb2095c0
> [ 7154.122087] Call Trace:
> [ 7154.143235] [<ffffffff814717d1>] dump_stack+0x4d/0x6c
> [ 7154.188021] [<ffffffff81076e43>] __warn+0xd3/0xf0
> [ 7154.229193] [<ffffffff81076eaf>] warn_slowpath_fmt+0x4f/0x60
> [ 7154.278808] [<ffffffff81113048>] ? ftrace_set_filter_ip+0x68/0x80
> [ 7154.332076] [<ffffffff816f3ae0>] ? tcp_send_loss_probe+0x210/0x210
> [ 7154.386125] [<ffffffff81108cde>] disarm_kprobe+0xbe/0x100
> [ 7154.433392] [<ffffffff81108da5>] __disable_kprobe+0x85/0x90
> [ 7154.482215] [<ffffffff81108dd1>] disable_kprobe+0x21/0x40
> [ 7154.528793] [<ffffffff8113264d>] disable_trace_kprobe+0xad/0xd0
> [ 7154.580556] [<ffffffff8113325d>] kprobe_register+0x2d/0xf0
> [ 7154.628573] [<ffffffff8112cf33>] perf_trace_event_unreg.isra.1+0x33/0x90
> [ 7154.687413] [<ffffffff8112d276>] perf_trace_destroy+0x36/0x50
> [ 7154.738283] [<ffffffff81145649>] tp_perf_event_destroy+0x9/0x10
> [ 7154.789275] [<ffffffff8114b777>] _free_event+0xb7/0x260
> [ 7154.835085] [<ffffffff8114bda4>] put_event+0x14/0x20
> [ 7154.878600] [<ffffffff8114bfb8>] perf_event_release_kernel+0x208/0x280
> [ 7154.935650] [<ffffffff8114c040>] perf_release+0x10/0x20
> [ 7154.981433] [<ffffffff811d62c6>] __fput+0xd6/0x210
> [ 7155.023478] [<ffffffff811d643e>] ____fput+0xe/0x10
> [ 7155.065547] [<ffffffff81091597>] task_work_run+0x77/0x90
> [ 7155.112033] [<ffffffff810799eb>] do_exit+0x2bb/0xb30
> [ 7155.155608] [<ffffffff8107a2df>] do_group_exit+0x3f/0xa0
> [ 7155.202120] [<ffffffff81084405>] get_signal+0x1b5/0x5d0
> [ 7155.247965] [<ffffffff8101b1e8>] do_signal+0x28/0x690
> [ 7155.292207] [<ffffffff81216f40>] ? ep_ptable_queue_proc+0xa0/0xa0
> [ 7155.345539] [<ffffffff812175e5>] ? ep_scan_ready_list.isra.12+0x185/0x1a0
> [ 7155.404848] [<ffffffff81677c25>] ? __sys_sendmsg+0x65/0x80
> [ 7155.452898] [<ffffffff81072442>] exit_to_usermode_loop+0x53/0x96
> [ 7155.505497] [<ffffffff8100294b>] do_syscall_64+0x8b/0xa0
> [ 7155.552023] [<ffffffff8179333c>] entry_SYSCALL64_slow_path+0x25/0x25
> [ 7155.609271] ---[ end trace 77eec51ed01e77cb ]---
> [ 7155.671656] fuse init (API version 7.24)
>
> ftrace is complaining that dyn_ftrace->flags is set to FTRACE_FL_DISABLED
> which is expected, since 'fuse' module was just loaded
> and ftrace_update_code() set it, but at the time ftrace_shutdown()
> was called, the ftrace_module_enable() for 'fuse' was not called,
> hence ftrace_shutdown() was complaining.
>
> In other words the module.c is doing:
> ftrace_module_init(mod); // loading 'fuse'
> ... // and here ftrace_shutdown() is called
> err = prepare_coming_module(mod); // it didn't have a chance to clear FL_DISABLED
>
> do you think the following diff is enough to silence the warning
> diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
> index b1870fbd2b67..edb00bf70d84 100644
> --- a/kernel/trace/ftrace.c
> +++ b/kernel/trace/ftrace.c
> @@ -2744,7 +2744,7 @@ static int ftrace_shutdown(struct ftrace_ops *ops, int command)
> struct dyn_ftrace *rec;
>
> do_for_each_ftrace_rec(pg, rec) {
> - if (FTRACE_WARN_ON_ONCE(rec->flags))
> + if (FTRACE_WARN_ON_ONCE(rec->flags & ~FTRACE_FL_DISABLED))
> pr_warn(" %pS flags:%lx\n",
> (void *)rec->ip, rec->flags);
> } while_for_each_ftrace_rec();
>
> or it's a tip of an iceberg and less hacky fix is needed?

Actually, this may be correct. I need to audit the other
do_for_each_ftrace_rec() loops.

-- Steve