Re: perf: 3.17 another perf_fuzzer lockup
From: Vince Weaver
Date: Wed Oct 15 2014 - 14:27:44 EST
OK, so it turns out that the oops I saw with memory corruption wasn't the
bug I was tracking, but something that comes up sometimes when trying to
run ftrace at the same time as fuzzing. So we'll leave that for another
day.
The 3.17+ lockup I am tracking still reproduces as of git from yesterday
(even after the 3.18-rc perf_event merges).
I can use sysrq to get a stack trace; one CPU is stuck in a call
to find_get_context().
An example backtrace:
[88200.300003] <EOI>
[88200.300003] [<ffffffff81114869>] ? ____cache_alloc+0x130/0x25b
[88200.300003] [<ffffffff8107fb05>] ? __call_rcu.constprop.63+0x1bf/0x1cb
[88200.300003] [<ffffffff8107fb2b>] kfree_call_rcu+0x1a/0x1c
[88200.300003] [<ffffffff810cf84f>] put_ctx+0x51/0x55
[88200.300003] [<ffffffff810d1840>] find_get_context+0x166/0x195
[88200.300003] [<ffffffff810d5856>] SYSC_perf_event_open+0x47b/0x7f5
[88200.300003] [<ffffffff810d5f55>] SyS_perf_event_open+0xe/0x10
[88200.300003] [<ffffffff815362d6>] system_call_fastpath+0x16/0x1b
It looks like the

	else if (task->perf_event_ctxp[ctxn])
		err = -EAGAIN;

case is triggering non-stop in the retry path of
find_get_context() and so the kernel gets stuck forever retrying.
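
A rough userspace model of that retry pattern, for illustration only.
This is a sketch, not the kernel source: every model_* name, the
lock_fails flag and the max_retries cap are hypothetical. It simply
assumes that the lookup of the existing context keeps failing while the
task's context slot stays non-NULL. Why that would happen in the real
kernel is not established here. The real loop has no retry cap, which
is what would make it spin forever.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures, not the real ones. */
struct model_ctx {
	int unused;
};

struct model_task {
	struct model_ctx *ctxp;	/* models task->perf_event_ctxp[ctxn] */
	int lock_fails;		/* assumption: lookup of the context keeps failing */
	int exiting;		/* models the PF_EXITING check */
};

/* Models "find and lock the existing context"; NULL means none was obtained. */
static struct model_ctx *model_lock_task_context(struct model_task *task)
{
	if (task->lock_fails)
		return NULL;
	return task->ctxp;
}

/*
 * Models the retry loop. Returns 0 on success, -ESRCH for an exiting task,
 * or -EAGAIN once max_retries is used up. The cap exists only so that the
 * model terminates. *retries reports how many times the loop went around.
 */
static int model_find_get_context(struct model_task *task, long max_retries,
				  long *retries)
{
	static struct model_ctx fresh;	/* stands in for a newly allocated context */
	int err;

	*retries = 0;
	for (;;) {
		if (model_lock_task_context(task))
			return 0;	/* existing context found */

		if (task->exiting)
			err = -ESRCH;
		else if (task->ctxp)
			err = -EAGAIN;	/* slot already occupied: the suspect branch */
		else {
			task->ctxp = &fresh;	/* install the new context */
			err = 0;
		}

		if (err != -EAGAIN)
			return err;
		if (++*retries >= max_retries)
			return -EAGAIN;	/* the real loop would just go around again */
	}
}
```

With lock_fails set and ctxp non-NULL, every pass takes the -EAGAIN
branch and the model only stops because of the cap. With an empty slot
it installs the new context on the first pass and returns 0.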
I can drop some printks in if it will help debug. I've tried running
ftrace, but for whatever reason if I enable ftrace the bug won't trigger.
Vince