Re: [PATCH 2/2] arm64: Fix watchpoint recursion when single-step is wrongly triggered in irq

From: Pratyush Anand
Date: Mon Mar 21 2016 - 07:06:09 EST

On 21/03/2016:06:38:31 PM, Wangnan (F) wrote:
> On 2016/3/21 18:24, Pratyush Anand wrote:
> >On 21/03/2016:08:37:50 AM, He Kuang wrote:
> >>On arm64, watchpoint handler enables single-step to bypass the next
> >>instruction for not recursive enter. If an irq is triggered right
> >>after the watchpoint, a single-step will be wrongly triggered in irq
> >>handler, which causes the watchpoint address not stepped over and
> >>system hang.
> >Does patch [1] resolves this issue as well? I hope it should. Patch[1] has still
> >not been sent for review. Your test result will be helpful.
> >
> >~Pratyush
> >
> >[1]
> Could you please provide a test program for your case so we can test
> it on our devices? I guess setting breakpoint on a "copy_from_user()"
> accessing an invalid address can trigger this problem?

My test case was to test kprobing of copy_from_user. I used kprobe64-v11.

I reverted "patch v11 3/9" and used the following script for __copy_to_user().
It places a kprobe on every instruction of a given function. With it, I can
easily see "Unexpected kernel single-step exception at EL1".
#!/bin/bash
# $1: function name
# Disable kprobe events and clear any existing trace data and probes.
echo 0 > /sys/kernel/debug/tracing/events/kprobes/enable
echo > /sys/kernel/debug/tracing/trace
echo > /sys/kernel/debug/tracing/kprobe_events
# Addresses of the function and of the symbol that follows it in kallsyms.
func=$(cat /proc/kallsyms | grep -A 1 -w $1 | cut -d ' ' -f 1)
func_start=$((0x$(echo $func | cut -d ' ' -f 1)))
func_end=$((0x$(echo $func | cut -d ' ' -f 2)))
offset=0
# Add one probe per 4-byte instruction slot (printf -v needs bash).
while [ $(($func_start + $offset)) -lt $func_end ]
do
	printf -v cmd "p:probe_%x $1+0x%x" $offset $offset
	echo $cmd >> /sys/kernel/debug/tracing/kprobe_events
	offset=$((offset + 4))
done
echo 1 > /sys/kernel/debug/tracing/events/kprobes/enable

# ./ __copy_to_user
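
To preview the probe definitions before anything is written to tracefs, a
dry-run variant of the loop above could look roughly like this. It is only a
sketch: gen_probes is a hypothetical helper name, and the start and end
addresses are passed in by hand as hex strings instead of being read from
/proc/kallsyms.

```shell
#!/bin/bash
# gen_probes NAME START END (hypothetical helper, not part of the script above)
# Print one kprobe definition per 4-byte instruction slot in [START, END).
# START and END are hex strings without the "0x" prefix.
gen_probes() {
	local name=$1
	local start=$((0x$2))
	local end=$((0x$3))
	local offset=0
	while [ $((start + offset)) -lt "$end" ]
	do
		printf 'p:probe_%x %s+0x%x\n' "$offset" "$name" "$offset"
		offset=$((offset + 4))
	done
}
```

For example, "gen_probes __copy_to_user 1000 100c" prints three definitions,
at offsets 0x0, 0x4 and 0x8.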

Now, if I apply the patch I referred to in [1], I no longer see any
"Unexpected kernel single-step exception at EL1" with the above test script.

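One way to check for the message after a test run is to count its occurrences
in the kernel log. A minimal sketch, assuming the message text appears
verbatim in the log; count_step_exceptions is a hypothetical helper name:

```shell
#!/bin/bash
# Hypothetical helper: count the lines on stdin that contain the
# unexpected single-step message. grep -c prints 0 and exits non-zero
# when nothing matches, so "|| true" keeps the caller from failing.
count_step_exceptions() {
	grep -c "Unexpected kernel single-step exception at EL1" || true
}

# Usage on a live system:
#   dmesg | count_step_exceptions
```

A count of 0 after running the test script would match the result described
above.
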
If I understood correctly, the problem you described in your patch is that an
irq (el1_irq) is raised while the kernel is still handling the watchpoint
(specifically, before the kernel can call reinstall_suspended_bps() to disable
single stepping). My patch disables single stepping for all EL1 exception
modes whenever kernel_enable_single_step() has been called but
kernel_disable_single_step() has not yet been called, so it should cover this
case as well. So, your test case could be another good test for my