Re: perf: perf_fuzzer lockup in perf_cgroup_attach

From: Vince Weaver
Date: Thu Sep 15 2016 - 08:48:11 EST


On Wed, 14 Sep 2016, Stephane Eranian wrote:

> I would think there is a way to disable KASLR for this kind of testing!

Yes, I just hadn't realized I had it enabled until I couldn't figure
out why addr2line wasn't working.
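
As a rough illustration of the check that would have caught this earlier,
here is a minimal sketch that looks for the `nokaslr` boot parameter in
/proc/cmdline. The helper name kaslr_disabled_on_cmdline is hypothetical,
and the check is only a heuristic: a kernel built without KASLR support
would also have it off without the parameter being present.

```python
from pathlib import Path


def kaslr_disabled_on_cmdline(cmdline: str) -> bool:
    """Return True if the `nokaslr` boot parameter appears as its own token."""
    # Hypothetical helper: splits on whitespace so that a substring of
    # another parameter does not count as a match.
    return "nokaslr" in cmdline.split()


if __name__ == "__main__":
    # /proc/cmdline holds the boot parameters of the running kernel.
    cmdline = Path("/proc/cmdline").read_text()
    if kaslr_disabled_on_cmdline(cmdline):
        print("nokaslr is set on the kernel command line")
    else:
        print("nokaslr is not set; KASLR may be active")
```

With KASLR active, addresses in an oops are offset from the ones in
vmlinux, which is why addr2line cannot resolve them directly.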

> Which of your fuzzer scripts are you using? fast_repro99.sh?

yes. I should probably give that script a more meaningful name, but by
now I think it's too late for that.

I am also running with perf_event_paranoid set to 0.
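
For anyone trying to reproduce, a minimal sketch of confirming that
setting before a run. It assumes "paranoid" here means the
kernel.perf_event_paranoid sysctl, exposed at
/proc/sys/kernel/perf_event_paranoid; the function names are hypothetical.

```python
from pathlib import Path

# Assumed location of the sysctl referred to as "paranoid" above.
PARANOID_PATH = "/proc/sys/kernel/perf_event_paranoid"


def parse_paranoid(text: str) -> int:
    """Parse the sysctl file contents (an integer followed by a newline)."""
    return int(text.strip())


def read_paranoid(path: str = PARANOID_PATH) -> int:
    """Read the current perf_event_paranoid level from the given path."""
    return parse_paranoid(Path(path).read_text())


if __name__ == "__main__":
    print(f"perf_event_paranoid = {read_paranoid()}")
```

Lower values are less restrictive, so the fuzzer can open more kinds of
events as an unprivileged user at 0 than at the higher levels.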

>
> > The best I can tell things are getting wedged somehow in
> > perf_cgroup_switch() while interrupts are disabled. Interrupts are never
> > getting re-enabled, causing the RCU and NMI watchdogs to trigger (and more
> > alarming things like the SATA bus resetting).
> >
> How do you get to perf_cgroup_switch() from the traces you provide below?

It was my best guess after trying to trace through the code. It could
in theory be anywhere, but it definitely seems to be happening at some
point after perf_cgroup_attach while interrupts are disabled (possibly
ftrace too?).

It's difficult because some code gets inlined and then it jumps through a
function pointer. And then, while I was getting close, the system finally
had enough mid-debug and started making threads hang forever.

I've rebooted the machine; I'll see if I can replicate it.

Vince