Re: perf: fuzzer BUG: KASAN: stack-out-of-bounds in __unwind_start

From: Petr Mladek
Date: Tue Nov 29 2016 - 11:29:33 EST


On Tue 2016-11-29 07:10:04, Paul E. McKenney wrote:
> On Tue, Nov 29, 2016 at 01:43:23PM +0100, Peter Zijlstra wrote:
> > On Mon, Nov 28, 2016 at 11:52:41PM -0600, Josh Poimboeuf wrote:
> >
> > > Did a little digging on git blame and found the following commit (which
> > > seems to be the cause of the KASAN warning and missing stack dump):
> > >
> > > bc1dce514e9b ("rcu: Don't use NMIs to dump other CPUs' stacks")
> > >
> > > I presume this commit is still needed because of the NMI printk deadlock
> > > issues which were discussed at Kernel Summit. I guess those issues need
> > > to be sorted out before the above commit can be reverted.
> >
> > Also, I most always run with these here patches applied:
> >
> > https://lkml.kernel.org/r/20161018170830.405990950@xxxxxxxxxxxxx
> >
> > People are very busy polishing the turd we call printk, but from where
> > I'm sitting its terminally and unfixably broken.

I still hope that we could do better :-)


> > I should certainly add a revert of the above commit to the stack of
> > patches I carry.
>
> This isn't making me feel particularly confident about switching RCU
> CPU stall warnings back to NMIs... ;-)

IMHO, trigger_single_cpu_backtrace() is pretty safe at the moment.
It uses per-CPU buffers in a lockless way in NMI context. It even makes
sure that the buffers are flushed to the main log buffer and console
once it is back from NMI.

In other words, the deadlocks in NMI context should be gone. The
NMI buffers are flushed using the classic printk(). Therefore
the risk is the same as when you use printk() directly
in rcu_dump_cpu_stacks() now.
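
To illustrate the idea, here is a minimal userspace sketch of the scheme
described above: the writer appends to a per-CPU buffer without taking any
lock, and the flush later copies the text into a main log from a safe
context. This is not the kernel code. All names here (nmi_buf_write(),
nmi_buf_flush(), NR_CPUS, the buffer sizes) are hypothetical and only
C11 atomics are assumed.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

#define NR_CPUS       4     /* hypothetical CPU count for this sketch */
#define NMI_BUF_SIZE  256   /* hypothetical per-CPU buffer size */
#define MAIN_LOG_SIZE 1024  /* hypothetical main log size */

struct nmi_buf {
	atomic_size_t len;          /* number of committed bytes */
	char data[NMI_BUF_SIZE];
};

static struct nmi_buf nmi_bufs[NR_CPUS];
static char main_log[MAIN_LOG_SIZE];
static size_t main_log_len;

/*
 * Simulated NMI-context writer: no locks are taken. The text is copied
 * behind the committed length first and then committed with a
 * compare-and-swap; on a race the copy is retried at the new offset.
 * Returns the number of bytes stored (the message may be truncated).
 */
static size_t nmi_buf_write(int cpu, const char *msg)
{
	struct nmi_buf *b = &nmi_bufs[cpu];
	size_t n = strlen(msg);
	size_t old = atomic_load(&b->len);
	size_t add;

	do {
		add = NMI_BUF_SIZE - old;       /* free space left */
		if (add > n)
			add = n;
		if (add == 0)
			return 0;               /* buffer full, message lost */
		memcpy(b->data + old, msg, add);
	} while (!atomic_compare_exchange_strong(&b->len, &old, old + add));

	return add;
}

/*
 * Flush from a safe (non-NMI) context: copy the committed bytes to the
 * main log, then reset the length only if nothing was appended in the
 * meantime; otherwise copy the newly added tail and try again.
 * Returns the number of bytes moved to the main log.
 */
static size_t nmi_buf_flush(int cpu)
{
	struct nmi_buf *b = &nmi_bufs[cpu];
	size_t done = 0, moved = 0;
	size_t len = atomic_load(&b->len);

	for (;;) {
		size_t chunk = len - done;
		size_t room = MAIN_LOG_SIZE - main_log_len;
		size_t expected = len;

		if (chunk > room)
			chunk = room;           /* main log full, truncate */
		memcpy(main_log + main_log_len, b->data + done, chunk);
		main_log_len += chunk;
		moved += chunk;
		done = len;

		if (atomic_compare_exchange_strong(&b->len, &expected, 0))
			break;
		len = expected;                 /* more data arrived, copy it too */
	}
	return moved;
}
```

In this sketch, only nmi_buf_flush() touches the main log, so the writer
never needs whatever lock protects that log. That is the property that
avoids the deadlock.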

Best Regards,
Petr