Re: [PATCH v2 1/4] arm64: kprobes: Recover pstate.D in single-step exception handler

From: James Morse
Date: Tue Jul 23 2019 - 12:03:56 EST


Hi!

On 22/07/2019 08:48, Masami Hiramatsu wrote:
> On arm64, if a nested kprobes hit, it can crash the kernel with below
> error message.
>
> [ 152.118921] Unexpected kernel single-step exception at EL1
>
> This is because commit 7419333fa15e ("arm64: kprobe: Always clear
> pstate.D in breakpoint exception handler") unmask pstate.D for
> doing single step but does not recover it after single step in
> the nested kprobes.

> That is correct *unless* any nested kprobes
> (single-stepping) runs inside other kprobes user handler.

(I don't think this is correct, it's just usually invisible as PSTATE.D is normally clear)


> When the 1st kprobe hits, do_debug_exception() will be called. At this
> point, debug exception (= pstate.D) must be masked (=1). When the 2nd
> (nested) kprobe is hit before single-step of the first kprobe, it
> unmask debug exception (pstate.D = 0) and return.
> Then, when the 1st kprobe setting up single-step, it saves current
> DAIF, mask DAIF, enable single-step, and restore DAIF.
> However, since "D" flag in DAIF is cleared by the 2nd kprobe, the
> single-step exception happens soon after restoring DAIF.

This is pretty complicated. Just to check I've understood this properly:
Stepping on a kprobe in a kprobe-user's pre_handler will cause the remainder of the
handler (the first one) to run with PSTATE.D clear. Once we enable single-step, we start
stepping the debug handler, and will never step the original kprobe'd instruction.

This is describing the most complicated way that this problem shows up! (I agree it's also
the worst)

I can get this to show up with just one kprobe. (function/file names here are meaningless):

| static int wibble(struct seq_file *m, void *discard)
| {
| 	unsigned long d, flags;
|
| 	flags = local_daif_save();
|
| 	kprobe_me();
| 	d = read_sysreg(daif);
| 	local_daif_restore(flags);
|
| 	seq_printf(m, "%lx\n", d);
|
| 	return 0;
| }

plumbed into debugfs, then kicked using the kprobe_example module:
| root@adam:/sys/kernel/debug# cat wibble
| 3c0

| root@adam:/sys/kernel/debug# insmod ~morse/kprobe_example.ko symbol=kprobe_me
| [ 69.478098] Planted kprobe at [..]
| root@adam:/sys/kernel/debug# cat wibble
| [ 71.478935] <kprobe_me> pre_handler: p->addr = [..], pc = [..], pstate = 0x600003c5
| [ 71.488942] <kprobe_me> post_handler: p->addr = [..], pstate = 0x600001c5
| 1c0

| root@adam:/sys/kernel/debug#

This is a problem for any code that had debug masked, not just kprobes.
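
To make the 3c0 vs 1c0 readings concrete, here is a small userspace sketch that
pulls the DAIF field out of a PSTATE value. The bit values are copied from the
arm64 PSR_*_BIT definitions so that it builds outside the kernel; the helper
names daif_field() and debug_masked() are made up for illustration and are not
kernel APIs.

```c
#include <stdbool.h>

/* Redefined locally so this builds in userspace; values follow the
 * arm64 PSR_*_BIT definitions. */
#define PSR_F_BIT	0x00000040UL
#define PSR_I_BIT	0x00000080UL
#define PSR_A_BIT	0x00000100UL
#define PSR_D_BIT	0x00000200UL
#define PSR_DAIF_MASK	(PSR_D_BIT | PSR_A_BIT | PSR_I_BIT | PSR_F_BIT)

/* Hypothetical helper: keep only the D, A, I and F bits of a PSTATE value. */
static unsigned long daif_field(unsigned long pstate)
{
	return pstate & PSR_DAIF_MASK;
}

/* Hypothetical helper: true when debug exceptions are masked (D set). */
static bool debug_masked(unsigned long pstate)
{
	return (pstate & PSR_D_BIT) != 0;
}
```

Feeding it the two pstate values from the log, daif_field(0x600003c5) gives 0x3c0
(debug masked) and daif_field(0x600001c5) gives 0x1c0: the only difference is that
D has been lost across the probe.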

Can we start the commit-message with the simplest description of the problem: kprobes
manipulates the interrupted PSTATE for single step, and doesn't restore it.

(trying to understand this bug through kprobe's interaction with itself is hard!)


> To solve this issue, this stores all DAIF bits and restore it
> after single stepping.


> diff --git a/arch/arm64/kernel/probes/kprobes.c b/arch/arm64/kernel/probes/kprobes.c
> index bd5dfffca272..348e02b799a2 100644
> --- a/arch/arm64/kernel/probes/kprobes.c
> +++ b/arch/arm64/kernel/probes/kprobes.c
> @@ -29,6 +29,8 @@
>
> #include "decode-insn.h"
>
> +#define PSR_DAIF_MASK (PSR_D_BIT | PSR_A_BIT | PSR_I_BIT | PSR_F_BIT)

We should probably move this to daifflags.h. It's going to be useful to other series too.
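
For anyone following along, this is roughly how I picture the difference between
restoring only the I bit and restoring the whole DAIF field after the step. It is
a userspace model under my own assumptions, not the patch's code: step_once() and
its restore_all flag are hypothetical names, and the PSR_* values are redefined
locally so it compiles outside the kernel.

```c
/* Redefined locally so this builds in userspace; values follow the
 * arm64 PSR_*_BIT definitions. */
#define PSR_F_BIT	0x00000040UL
#define PSR_I_BIT	0x00000080UL
#define PSR_A_BIT	0x00000100UL
#define PSR_D_BIT	0x00000200UL
#define PSR_DAIF_MASK	(PSR_D_BIT | PSR_A_BIT | PSR_I_BIT | PSR_F_BIT)

/*
 * Hypothetical model of one single-step round trip on the interrupted
 * PSTATE. restore_all == 0 models "only the I bit is put back";
 * restore_all != 0 models "all of DAIF is saved and put back".
 */
static unsigned long step_once(unsigned long pstate, int restore_all)
{
	unsigned long keep = restore_all ? PSR_DAIF_MASK : PSR_I_BIT;
	unsigned long saved = pstate & keep;

	/* Prepare for the step: mask IRQs, unmask debug exceptions. */
	pstate |= PSR_I_BIT;
	pstate &= ~PSR_D_BIT;

	/* ... the single-step of the probed instruction happens here ... */

	/* Put back whichever bits were saved. */
	pstate &= ~keep;
	pstate |= saved;

	return pstate;
}
```

With the pre_handler value from my log, step_once(0x600003c5, 0) comes back as
0x600001c5 (D lost, matching the post_handler line), while
step_once(0x600003c5, 1) comes back unchanged.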


Patch looks good!
Reviewed-by: James Morse <james.morse@xxxxxxx>
Tested-by: James Morse <james.morse@xxxxxxx>

(I haven't tried to test the nested kprobes case...)


Thanks,

James