Re: hw_breakpoint: Fix Oops at destroying hw_breakpoint event on powerpc

From: Peter Zijlstra
Date: Thu Mar 03 2016 - 05:20:22 EST


On Thu, Mar 03, 2016 at 08:23:38PM +1100, Michael Ellerman wrote:
> On Wed, 2016-03-02 at 12:59 +0100, Peter Zijlstra wrote:
>
> > On Wed, Mar 02, 2016 at 10:53:24PM +1100, Michael Ellerman wrote:
>
> > > Peterz, acme, do you guys want to take this? Or should I?
> >
> > I'm not too happy it's touching event->ctx at all. It really should not
> > be doing that.
>
> Hmm OK.
>
> It's been using ctx->task since it was merged in 2010. In fact that commit also
> added arch_unregister_hw_breakpoint(), and we're still the only user of that.

Yes, I saw that.

> The prima facie reason it's using ctx is to get at task->thread to clear
> last_hit_ubp.

Indeed, but if there's a preemption point in between setting and using
that state, the ctx->task pointer might not actually still point to the
same task. With inherited events, if the next task has the exact same
(inherited) event configuration, the event might get swapped over to
that task instead of the hardware being reprogrammed.

And if there's no preemption point in between then:

> It looks like other arches avoid needing to do something similar by storing the
> break point in a per-cpu array. Which I guess is what you meant in your other
> mail ("Why do you keep per task state anyway?").

this seems possible. And indeed, that was part of that question. The
other part was wondering how per-cpu breakpoints were treated, although
for those the per-cpu storage thing is an 'obvious' solution.
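
A minimal userspace sketch of the per-cpu idea, assuming one "last hit"
slot per CPU. The names (bp_model, last_hit_bp, bp_hit, bp_last_hit,
bp_unregister, NR_CPUS_MODEL) are hypothetical and not taken from the
powerpc code; in the kernel this would presumably be a DEFINE_PER_CPU
variable accessed with this_cpu_read()/this_cpu_write() rather than a
plain array.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS_MODEL 4	/* hypothetical CPU count for this model */

/* Stand-in for struct perf_event; only identity matters here. */
struct bp_model {
	int id;
};

/* One slot per CPU, modelling a per-cpu "last hit breakpoint" pointer. */
static struct bp_model *last_hit_bp[NR_CPUS_MODEL];

/* Record that @bp fired on @cpu (would run in the exception handler). */
static void bp_hit(int cpu, struct bp_model *bp)
{
	assert(cpu >= 0 && cpu < NR_CPUS_MODEL);
	last_hit_bp[cpu] = bp;
}

/* Return the breakpoint last hit on @cpu, or NULL if there is none. */
static struct bp_model *bp_last_hit(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS_MODEL);
	return last_hit_bp[cpu];
}

/*
 * Drop every reference to @bp when it is destroyed. Nothing here looks
 * at a task or a context, so a stale ctx->task cannot be dereferenced.
 */
static void bp_unregister(struct bp_model *bp)
{
	for (int cpu = 0; cpu < NR_CPUS_MODEL; cpu++) {
		if (last_hit_bp[cpu] == bp)
			last_hit_bp[cpu] = NULL;
	}
}
```

The point of the sketch is only that teardown walks CPU slots instead
of following event->ctx; the locking and cross-CPU access a real
implementation needs are left out.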

> I can't think of a reason why we can't also store it per-cpu, but I could be
> wrong, I don't know the code well and I haven't thought about it for very long.

Right, so I'm not really up to snuff on the whole hw_breakpoint stuff
either, that was bolted onto perf by mingo, fweisbec, kprasad and others
while I was doing PMU bits, and I've never really dug into it.

I understand kprasad is no longer with IBM, his email bounced. That's a
shame because he knew about this stuff.. :/

> Do you mind if I merge the following fix for now as a band-aid, and we'll try
> and fix it up properly in the next few weeks (but maybe not in time for 4.5
> final)?

OK, that works for me.

Acked-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>