[PATCHv2] perf powerpc: Don't call perf_event_disable from atomic context

From: Jiri Olsa
Date: Thu Oct 06 2016 - 08:33:13 EST


On Thu, Oct 06, 2016 at 09:24:15AM +0200, Peter Zijlstra wrote:
> On Wed, Oct 05, 2016 at 09:53:38PM +0200, Jiri Olsa wrote:
> > On Wed, Oct 05, 2016 at 10:09:21AM +0200, Jiri Olsa wrote:
> > > On Tue, Oct 04, 2016 at 03:29:33PM +1100, Michael Ellerman wrote:
> > >
> > > SNIP
> > >
> > > > Which is where we cope with the possibility that we couldn't emulate the
> > > > instruction that hit the breakpoint. Seems that is not an issue on x86,
> > > > or it's handled elsewhere?
> > > >
> > > > We should fix emulate_step() if it failed to emulate something it
> > > > should have, but there will always be the possibility that it fails.
> > > >
> > > > Instead of calling perf_event_disable() we could just add a flag to
> > > > arch_hw_breakpoint that says we hit an error on the event, and block
> > > > reinstalling it in arch_install_hw_breakpoint().
> > >
> > > ok, might be easier.. I'll check on that
> >
> > so staring on that I think disabling is the right way here..
> >
> > we need the event to be unscheduled and not scheduled back
> > again, I don't see better way at the moment
>
> OK, can you resend the patch with updated Changelog that explains these
> things?

attached

thanks,
jirka


---
The trinity syscall fuzzer triggered the following WARN on powerpc:
WARNING: CPU: 9 PID: 2998 at arch/powerpc/kernel/hw_breakpoint.c:278
...
NIP [c00000000093aedc] .hw_breakpoint_handler+0x28c/0x2b0
LR [c00000000093aed8] .hw_breakpoint_handler+0x288/0x2b0
Call Trace:
[c0000002f7933580] [c00000000093aed8] .hw_breakpoint_handler+0x288/0x2b0 (unreliable)
[c0000002f7933630] [c0000000000f671c] .notifier_call_chain+0x7c/0xf0
[c0000002f79336d0] [c0000000000f6abc] .__atomic_notifier_call_chain+0xbc/0x1c0
[c0000002f7933780] [c0000000000f6c40] .notify_die+0x70/0xd0
[c0000002f7933820] [c00000000001a74c] .do_break+0x4c/0x100
[c0000002f7933920] [c0000000000089fc] handle_dabr_fault+0x14/0x48

Followed by a lockdep warning:
===============================
[ INFO: suspicious RCU usage. ]
4.8.0-rc5+ #7 Tainted: G W
-------------------------------
./include/linux/rcupdate.h:556 Illegal context switch in RCU read-side critical section!

other info that might help us debug this:

rcu_scheduler_active = 1, debug_locks = 0
2 locks held by ls/2998:
#0: (rcu_read_lock){......}, at: [<c0000000000f6a00>] .__atomic_notifier_call_chain+0x0/0x1c0
#1: (rcu_read_lock){......}, at: [<c00000000093ac50>] .hw_breakpoint_handler+0x0/0x2b0

stack backtrace:
CPU: 9 PID: 2998 Comm: ls Tainted: G W 4.8.0-rc5+ #7
Call Trace:
[c0000002f7933150] [c00000000094b1f8] .dump_stack+0xe0/0x14c (unreliable)
[c0000002f79331e0] [c00000000013c468] .lockdep_rcu_suspicious+0x138/0x180
[c0000002f7933270] [c0000000001005d8] .___might_sleep+0x278/0x2e0
[c0000002f7933300] [c000000000935584] .mutex_lock_nested+0x64/0x5a0
[c0000002f7933410] [c00000000023084c] .perf_event_ctx_lock_nested+0x16c/0x380
[c0000002f7933500] [c000000000230a80] .perf_event_disable+0x20/0x60
[c0000002f7933580] [c00000000093aeec] .hw_breakpoint_handler+0x29c/0x2b0
[c0000002f7933630] [c0000000000f671c] .notifier_call_chain+0x7c/0xf0
[c0000002f79336d0] [c0000000000f6abc] .__atomic_notifier_call_chain+0xbc/0x1c0
[c0000002f7933780] [c0000000000f6c40] .notify_die+0x70/0xd0
[c0000002f7933820] [c00000000001a74c] .do_break+0x4c/0x100
[c0000002f7933920] [c0000000000089fc] handle_dabr_fault+0x14/0x48

While the first WARN is probably valid, the second one is triggered
by disabling the event via perf_event_disable() from atomic context.

The event is disabled here because we were not able to emulate
the instruction that hit the breakpoint. Disabling unschedules
the event and makes sure it is not scheduled back in.

But we can't call perf_event_disable() from atomic context, because
it takes a mutex (see perf_event_ctx_lock_nested in the trace above).
Instead we need to disable the event through its pending_disable
irq_work.

Add a new function for that:
  perf_event_disable_inatomic(event)

Reported-by: Jan Stancek <jstancek@xxxxxxxxxx>
Signed-off-by: Jiri Olsa <jolsa@xxxxxxxxxx>
---
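For reviewers less familiar with the pattern, below is a minimal user-space
sketch of the deferral described in the changelog. It is only a model under
stated assumptions: every model_* name is hypothetical and not a kernel API,
a plain flag stands in for irq_work_queue(), and the deferred callback is
assumed to disable the event without taking the context mutex.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names exist in the kernel. */
struct model_event {
	bool enabled;
	bool pending_disable;	/* plays the role of event->pending_disable */
	bool work_queued;	/* stands in for irq_work_queue(&event->pending) */
};

/* True while we pretend to run inside an exception handler. */
static bool model_in_atomic;

/* Models the mutex-taking disable path: it must never run in atomic context. */
static void model_disable_with_mutex(struct model_event *ev)
{
	assert(!model_in_atomic);	/* taking a mutex here could sleep */
	ev->enabled = false;
}

/* Models the atomic-safe request: only set a flag and queue deferred work. */
static void model_disable_inatomic(struct model_event *ev)
{
	ev->pending_disable = true;
	ev->work_queued = true;
}

/* Models the deferred callback: disables the event without any mutex. */
static void model_run_pending(struct model_event *ev)
{
	if (!ev->work_queued)
		return;
	ev->work_queued = false;

	if (ev->pending_disable) {
		ev->pending_disable = false;
		ev->enabled = false;
	}
}

/* Models a breakpoint handler that could not emulate the instruction. */
static void model_breakpoint_handler(struct model_event *ev)
{
	model_in_atomic = true;
	model_disable_inatomic(ev);	/* safe: no lock is taken here */
	model_in_atomic = false;
}
```

In this model the event stays enabled when model_breakpoint_handler()
returns and is only disabled once model_run_pending() runs. Calling
model_disable_with_mutex() from the handler instead would trip the assert,
which is the user-space analogue of the lockdep splat above.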
 arch/powerpc/kernel/hw_breakpoint.c |  2 +-
 include/linux/perf_event.h          |  1 +
 kernel/events/core.c                | 11 ++++++++---
 3 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/kernel/hw_breakpoint.c b/arch/powerpc/kernel/hw_breakpoint.c
index aec9a1b1d25b..4d3bcbbf626a 100644
--- a/arch/powerpc/kernel/hw_breakpoint.c
+++ b/arch/powerpc/kernel/hw_breakpoint.c
@@ -275,7 +275,7 @@ int __kprobes hw_breakpoint_handler(struct die_args *args)
 	if (!stepped) {
 		WARN(1, "Unable to handle hardware breakpoint. Breakpoint at "
 			"0x%lx will be disabled.", info->address);
-		perf_event_disable(bp);
+		perf_event_disable_inatomic(bp);
 		goto out;
 	}
 	/*
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 5c5362584aba..c794fd84a595 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1248,6 +1248,7 @@ extern u64 perf_swevent_set_period(struct perf_event *event);
extern void perf_event_enable(struct perf_event *event);
extern void perf_event_disable(struct perf_event *event);
extern void perf_event_disable_local(struct perf_event *event);
+extern void perf_event_disable_inatomic(struct perf_event *event);
extern void perf_event_task_tick(void);
#else /* !CONFIG_PERF_EVENTS: */
static inline void *
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 7c0d263f6bc5..3d650ccf4def 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -1960,6 +1960,13 @@ void perf_event_disable(struct perf_event *event)
}
EXPORT_SYMBOL_GPL(perf_event_disable);

+void perf_event_disable_inatomic(struct perf_event *event)
+{
+	event->pending_kill = POLL_HUP;
+	event->pending_disable = 1;
+	irq_work_queue(&event->pending);
+}
+
static void perf_set_shadow_time(struct perf_event *event,
struct perf_event_context *ctx,
u64 tstamp)
@@ -7074,9 +7081,7 @@ static int __perf_event_overflow(struct perf_event *event,
 	event->pending_kill = POLL_IN;
 	if (events && atomic_dec_and_test(&event->event_limit)) {
 		ret = 1;
-		event->pending_kill = POLL_HUP;
-		event->pending_disable = 1;
-		irq_work_queue(&event->pending);
+		perf_event_disable_inatomic(event);
 	}

 	event->overflow_handler(event, data, regs);
--
2.7.4