Re: [PATCH v6 0/6] arm64: Add kernel probes (kprobes) support

From: William Cohen
Date: Mon Apr 27 2015 - 22:58:29 EST



Hi All,

I have been experimenting with the patches for arm64 kprobes support.
On occasion the kernel gets stuck in a loop printing output:

Unexpected kernel single-step exception at EL1

This message by itself is not that enlightening. I added the attached
patch to get some additional information about register state when the
warning is printed. Below is an example output:


[14613.263536] Unexpected kernel single-step exception at EL1
[14613.269001] kcb->ss_ctx.ss_status = 1
[14613.272643] kcb->ss_ctx.match_addr = fffffdfffc001250 0xfffffdfffc001250
[14613.279324] instruction_pointer(regs) = fffffe0000093358 el1_da+0x8/0x70
[14613.286003]
[14613.287487] CPU: 3 PID: 621 Comm: irqbalance Tainted: G OE 4.0.0u4+ #6
[14613.295019] Hardware name: AppliedMicro Mustang/Mustang, BIOS 1.1.0-rh-0.15 Mar 13 2015
[14613.302982] task: fffffe01d6806780 ti: fffffe01d68ac000 task.ti: fffffe01d68ac000
[14613.310430] PC is at el1_da+0x8/0x70
[14613.313990] LR is at trampoline_probe_handler+0x188/0x1ec
[14613.319363] pc : [<fffffe0000093358>] lr : [<fffffe0000687590>] pstate: 600001c5
[14613.326724] sp : fffffe01d68af640
[14613.330021] x29: fffffe01d68afbf0 x28: fffffe01d68ac000
[14613.335328] x27: fffffe00000939cc x26: fffffe0000bb09d0
[14613.340634] x25: fffffe01d68afdb0 x24: 0000000000000025
[14613.345939] x23: 00000000800003c5 x22: fffffdfffc001284
[14613.351245] x21: fffffe01d68af760 x20: fffffe01d7c79a00
[14613.356552] x19: 0000000000000000 x18: 000003ffa4b8e600
[14613.361858] x17: 000003ffa5480698 x16: fffffe00001f2afc
[14613.367164] x15: 0000000000000007 x14: 000003ffeffa8690
[14613.372471] x13: 0000000000000001 x12: 000003ffa4baf200
[14613.377778] x11: fffffe00006bb328 x10: fffffe00006bb32c
[14613.383084] x9 : fffffe01d68afd10 x8 : fffffe01d6806d10
[14613.388390] x7 : fffffe01ffd01298 x6 : fffffe000009192c
[14613.393696] x5 : fffffe0000c1b398 x4 : 0000000000000000
[14613.399001] x3 : 0000000000200200 x2 : 0000000000100100
[14613.404306] x1 : 0000000096000006 x0 : 0000000000000015
[14613.409610]
[14613.411094] BUG: failure at arch/arm64/kernel/debug-monitors.c:276/single_step_handler()!


The really odd thing is the address of the PC: it is in el1_da, the code
that handles data aborts. It looks like the unexpected single-step
exception arrives right after the enable_dbg in el1_da. I think
what might be happening is:

-an instruction is instrumented with kprobe
-the instruction is copied to a buffer
-a breakpoint replaces the instruction
-the kprobe fires when the breakpoint is encountered
-the instruction in the buffer is set to single step
-a single step of the instruction is attempted
-a data abort exception is raised
-el1_da is called
-el1_da does an enable_dbg to unmask the debug exceptions
-single_step_handler is called
-single_step_handler doesn't find anything to handle that pc
-single_step_handler prints the warning about unexpected el1 single step
-single_step_handler re-enables single step
-the single step of the instruction is attempted endlessly
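
To make the suspected sequence above easier to reason about, here is a
small user-space toy model in C. It is only a sketch of my hypothesis,
not kernel code: struct cpu_model, model_single_step_handler(),
model_el1_da() and simulate() are all made-up names, and the "rearm"
flag only stands in for the re-enabling of single step at the end of
single_step_handler. The rearm=false case is there purely for contrast
and says nothing about whether that would be the correct fix.

```c
#include <stdbool.h>

/* Toy model only: none of these names are kernel APIs. */
struct cpu_model {
	bool ss_armed;   /* single step is armed for the next instruction */
	bool dbg_masked; /* debug exceptions are currently masked */
	int unexpected;  /* number of "unexpected single-step" warnings */
};

/* Models single_step_handler when no registered hook claims the PC. */
static void model_single_step_handler(struct cpu_model *cpu, bool rearm)
{
	cpu->unexpected++;     /* the warning would be printed here */
	cpu->ss_armed = rearm; /* models re-enabling single step */
}

/*
 * Models el1_da: unmasking debug exceptions lets a still-armed step
 * exception fire before the abort itself is handled.
 * Returns true if the data abort got handled.
 */
static bool model_el1_da(struct cpu_model *cpu, bool rearm)
{
	cpu->dbg_masked = false; /* models enable_dbg */
	if (!cpu->dbg_masked && cpu->ss_armed) {
		model_single_step_handler(cpu, rearm);
		return false;
	}
	return true;
}

/*
 * Retry the stepped instruction up to max_attempts times and return
 * how many unexpected single-step warnings were counted.
 */
int simulate(bool rearm, int max_attempts)
{
	struct cpu_model cpu = {
		.ss_armed = true, .dbg_masked = true, .unexpected = 0,
	};

	for (int i = 0; i < max_attempts; i++) {
		cpu.dbg_masked = true; /* exception entry masks debug */
		if (model_el1_da(&cpu, rearm))
			break;
	}
	return cpu.unexpected;
}
```

In this model, simulate(true, n) counts one warning per attempt and
never reaches the abort handling, which matches the endless loop I am
seeing. With rearm set to false the model counts a single warning and
then handles the abort on the next attempt.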

It looks like commit 1059c6bf8534acda249e7e65c81e7696fb074dc1 from Mon
Sep 22 "arm64: debug: don't re-enable debug exceptions on return from el1_dbg"
was trying to address a similar problem for the el1_dbg
function. Should el1_da and other el1_* functions have the enable_dbg
removed?

If single_step_handler doesn't find a handler, is re-enabling the
single step with set_regs_spsr_ss in single_step_handler the right thing to do?

-Will

diff --git a/arch/arm64/kernel/debug-monitors.c b/arch/arm64/kernel/debug-monitors.c
index dae7bb4..ec5a1b2 100644
--- a/arch/arm64/kernel/debug-monitors.c
+++ b/arch/arm64/kernel/debug-monitors.c
@@ -262,6 +262,19 @@ static int single_step_handler(unsigned long addr, unsigned int esr,

 	if (!handler_found) {
 		pr_warning("Unexpected kernel single-step exception at EL1\n");
+		{
+			struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
+			pr_warning("kcb->ss_ctx.ss_status = %ld\n",
+				   kcb->ss_ctx.ss_status);
+			printk("kcb->ss_ctx.match_addr = %lx ",
+			       kcb->ss_ctx.match_addr);
+			print_symbol("%s\n", kcb->ss_ctx.match_addr);
+			printk("instruction_pointer(regs) = %lx ",
+			       instruction_pointer(regs));
+			print_symbol("%s\n", instruction_pointer(regs));
+			show_regs(regs);
+			BUG();
+		}
 		/*
 		 * Re-enable stepping since we know that we will be
 		 * returning to regs.