Re: WARNING at arch/x86/kernel/alternative.c:707 text_poke+0x25d/0x270

From: Masami Hiramatsu
Date: Tue May 16 2017 - 11:17:01 EST


On Tue, 16 May 2017 09:48:02 -0400
Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:

> Hi Masami,
>
> Thomas hit the following bug:
>
> [ 169.795937] trace_kprobe: Testing kprobe tracing: OK
> [ 169.833031] rtc_cmos 00:00: setting system clock to 2017-05-16 13:19:26 UTC (1494940766)
> [ 169.833986] PM: Hibernation image not present or could not be loaded.
> [ 169.836222] Freeing unused kernel memory: 1376K
> [ 169.836793] ------------[ cut here ]------------
> [ 169.836802] WARNING: CPU: 1 PID: 240 at /home/tglx/work/kernel/linus/linux/arch/x86/kernel/alternative.c:707 text_poke+0x25d/0x270
> [ 169.836803] Modules linked in:
> [ 169.836806] CPU: 1 PID: 240 Comm: kworker/1:1 Not tainted 4.12.0-rc1+ #138
> [ 169.836808] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
> [ 169.836812] Workqueue: events kprobe_optimizer
> [ 169.836819] task: ffff8802344e28c0 task.stack: ffffc90001cac000
> [ 169.836821] RIP: 0010:text_poke+0x25d/0x270
> [ 169.836822] RSP: 0000:ffffc90001cafd38 EFLAGS: 00010246
> [ 169.836825] RAX: 0100000000000000 RBX: ffffffff81d9daeb RCX: ffffffffa0008000
> [ 169.836826] RDX: 0000000000000000 RSI: ffffc90001cafd87 RDI: ffffffff81d9daeb
> [ 169.836827] RBP: ffffc90001cafd70 R08: 0000000000000001 R09: 0000000000000000
> [ 169.836828] R10: ffffc90001cafe08 R11: 0000000000000008 R12: 0000000000000001
> [ 169.836829] R13: ffffc90001cafd87 R14: ffffffff81d9daec R15: ffffea0000067a58
> [ 169.836831] FS: 0000000000000000(0000) GS:ffff88023ec40000(0000) knlGS:0000000000000000
> [ 169.836832] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 169.836833] CR2: ffffc9000146c000 CR3: 0000000001c0f000 CR4: 00000000003406e0
> [ 169.836837] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 169.836838] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ 169.836839] Call Trace:
> [ 169.836845] ? init_kprobe_trace+0x96/0x96
> [ 169.836848] ? kprobe_trace_selftest_target+0x1/0x13
> [ 169.836851] text_poke_bp+0x57/0xf0
> [ 169.836857] arch_unoptimize_kprobe+0x42/0x60
> [ 169.836860] arch_unoptimize_kprobes+0x35/0x90
> [ 169.836865] kprobe_optimizer+0x63/0x2a0
> [ 169.836870] process_one_work+0x1ed/0x630
> [ 169.836877] worker_thread+0x69/0x3c0
> [ 169.836884] kthread+0x138/0x170
> [ 169.836886] ? process_one_work+0x630/0x630
> [ 169.836888] ? kthread_create_on_node+0x70/0x70
> [ 169.836893] ret_from_fork+0x31/0x40
> [ 169.836903] Code: e8 f9 5d 35 00 31 d2 31 f6 bf 86 02 00 00 ff 15 2a fc a2 00 e9 3d ff ff ff 48 8b 7d c8 57 9d 0f 1f 44 00 00 e8 25 95 09 00 eb ad <0f> ff e9 12 fe ff ff 66 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f
> [ 169.836955] ---[ end trace 3d393a48ed109eb0 ]---
> [ 169.837103] ------------[ cut here ]------------
>
> It appears that the kprobe_optimizer work thread call happened after
> the init pages were freed, causing alternative.c to give the above
> warning because the text that is being unoptimized happens to no longer
> exist.

Ah, I see. I need to check that case. Actually for the module
init text area, kill_kprobe() correctly kicks kill_optimized_kprobe()
so it should safe. But above case is on the init-text in kernel
itself. I guess module_notifier may not be called for that area...

>
> This doesn't always happen, but it may have been something that's been
> there for a while.

I see, since it depends on the timing when the thread is kicked.

> Can you take a look at it, and perhaps find a way to
> flush the work queue before exiting the kprobe test?

Yes, I must handle it.

Thank you!

>
> Thanks!
>
> -- Steve


--
Masami Hiramatsu <mhiramat@xxxxxxxxxx>