Re: perf: fuzzer KASAN unwind_get_return_address

From: Peter Zijlstra
Date: Thu Nov 17 2016 - 12:13:46 EST


On Thu, Nov 17, 2016 at 09:18:48AM -0600, Josh Poimboeuf wrote:
> On Thu, Nov 17, 2016 at 10:04:46AM +0100, Peter Zijlstra wrote:
> > On Wed, Nov 16, 2016 at 10:48:28PM -0600, Josh Poimboeuf wrote:
> > > Peter or Vince, can you try to recreate with this patch? It dumps the
> > > raw stack contents during a stack dump. Hopefully that would give a
> > > clue about what's going wrong.
> >
> >
> > Here goes... I'll do another run and get you the results of that as
> > well.
>
> Thanks, I just waded through this and it turned up some good clues. And
> according to 'git blame', you might be able to help :-)
>
> It's not stack corruption. Instead it looks like
> __intel_pmu_pebs_event() is creating a bad or stale pt_regs which gets
> passed to the unwinder. Specifically, regs->bp points to a seemingly
> random address on the NMI stack. Which seems odd, considering the code
> itself is running on the same NMI stack.
>
> I don't know much about the PEBS code but it seems like it's passing
> some stale data. Either that or there's some NMI nesting going on.

Ooh, indeed. The PEBS record can be quite stale by the time we get to
the interrupt. Using those registers for an unwind is 'interesting' at
best.

Esp. with the multi-PEBS stuff that's landed this can be very, very
stale, but even single PEBS can have a radically different stack at
interrupt time than we had at record time -- imagine an (i)ret happening
in between.
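
To make the staleness concrete, here is a toy user-space model, not
kernel code. Every name in it (toy_stack, toy_regs, toy_call, ...) is
made up for illustration, and the "stack" is just a small array indexed
by slot. It only sketches the general hazard: a register snapshot taken
at record time still points at the same stack slots later, but a return
in between lets those slots be reused for something else.

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_WORDS 16

/* Toy downward-growing stack; indices stand in for addresses. */
struct toy_stack {
	uintptr_t mem[STACK_WORDS];
	size_t sp;	/* lowest used slot; STACK_WORDS when empty */
	size_t bp;	/* current frame record; STACK_WORDS when none */
};

/* Hypothetical register snapshot, standing in for a sample record. */
struct toy_regs {
	size_t sp;
	size_t bp;
};

static void toy_init(struct toy_stack *s)
{
	s->sp = STACK_WORDS;
	s->bp = STACK_WORDS;
}

/* "call": push return address, push caller's bp, start a new frame. */
static void toy_call(struct toy_stack *s, uintptr_t ret_addr)
{
	s->mem[--s->sp] = ret_addr;
	s->mem[--s->sp] = (uintptr_t)s->bp;
	s->bp = s->sp;
}

/* "ret": drop the current frame and restore the caller's bp. */
static void toy_ret(struct toy_stack *s)
{
	s->sp = s->bp;
	s->bp = (size_t)s->mem[s->sp++];	/* pop saved bp */
	s->sp++;				/* pop return address */
}

/* Any later push is free to reuse the slots of a popped frame. */
static void toy_push(struct toy_stack *s, uintptr_t v)
{
	s->mem[--s->sp] = v;
}

/* Capture registers "at record time". */
static struct toy_regs toy_snapshot(const struct toy_stack *s)
{
	struct toy_regs r = { .sp = s->sp, .bp = s->bp };
	return r;
}

/*
 * Read the return address the frame at regs->bp claims to have, using
 * the stack contents as they are *now*. Returns 0 if bp is out of range.
 */
static uintptr_t toy_return_address(const struct toy_stack *s,
				    const struct toy_regs *regs)
{
	if (regs->bp + 1 >= STACK_WORDS)
		return 0;
	return s->mem[regs->bp + 1];
}
```

In this model, taking a snapshot inside a frame entered with
toy_call(&s, 0x1000), then doing toy_ret() and a couple of toy_push()
calls before calling toy_return_address() with the old snapshot, yields
whatever was pushed last into that slot rather than 0x1000. Note that a
plain "is bp within the stack?" check would not flag this, since the
stale bp still lands inside the same stack.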

Let me consider that code, and what to do about this; it's been a while
since I went over all that.