Re: perf_event hard locks 3.1.x

From: Vince Weaver
Date: Fri Jan 13 2012 - 17:43:42 EST



On Fri, 16 Dec 2011, Vince Weaver wrote:

> I had a PAPI user report that perf_event usage (such as running
> the PAPI tests) would cause hard lockups on his 3.1.x kernel
> (from ARCH linux).
>
> After some tedious bisection of the .config file, I found that the issue
> happens when
> CONFIG_SLUB=y
> CONFIG_SLUB_DEBUG=y
> is enabled. Having a kernel with that enabled and stressing the
> perf_event subsystem will eventually cause lockups or hard crashes.

I spent a lot of time trying to track this down. The problem does
not appear with stock 3.2.

The problem is still there with 3.1.9, but since that might be the last
3.1.x kernel it might not matter anymore.

Summary of what I found:
you need to have CONFIG_SLUB=y
you can cause the crash by running the PAPI ctests 1-3 times
(the program that triggers the crash is different each time, so
there is no good workaround; probably a race condition).
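
For anyone wanting to check whether a given kernel build matches the
configuration described in this thread, a minimal sketch follows. The
helper names (enabled_options, has_options) are made up for
illustration, and the code works only on the text of a .config file;
where that file lives on a given system is left to the reader.

```python
import re


def enabled_options(config_text):
    """Return the set of CONFIG_* names that are set to 'y' in .config text."""
    enabled = set()
    for line in config_text.splitlines():
        # Lines such as "# CONFIG_FOO is not set" do not match and are skipped.
        match = re.match(r"^(CONFIG_\w+)=y$", line.strip())
        if match:
            enabled.add(match.group(1))
    return enabled


def has_options(config_text, options):
    """True if every option named in `options` is built in (=y)."""
    enabled = enabled_options(config_text)
    return all(opt in enabled for opt in options)


# Hypothetical .config fragment, not taken from the reporter's kernel.
sample = """\
CONFIG_SLUB=y
CONFIG_SLUB_DEBUG=y
# CONFIG_SLAB is not set
"""
print(has_options(sample, ["CONFIG_SLUB", "CONFIG_SLUB_DEBUG"]))
```

This only tells you whether the options are present. It says nothing
about whether the lockup will actually reproduce on that kernel.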

I reverse-bisected the fix between 3.1 and 3.2 (that is to say, I
bisected to find when the kernels stopped crashing) to this commit:

commit a33caeb118198286309859f014c0662f3ed54ed4
lockdep, kmemcheck: Annotate ->lock in lockdep_init_map()

but that fix is already in 3.1.9 and doesn't seem to fix things.
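
To make the "reverse bisect" idea concrete, here is a small sketch of
the search logic: a binary search for the first commit that no longer
crashes, instead of the first one that does. It is not the git bisect
implementation; first_fixed and the crashes predicate are hypothetical
names, and the sketch assumes a linear history with exactly one
transition from crashing to not crashing.

```python
def first_fixed(commits, crashes):
    """Return the first commit in `commits` for which crashes() is False.

    Assumes commits[0] crashes, commits[-1] does not, and that there is
    a single crashing -> fixed transition somewhere in between.
    """
    lo, hi = 0, len(commits) - 1  # lo: known to crash, hi: known fixed
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if crashes(commits[mid]):
            lo = mid  # still broken, so the fix is later
        else:
            hi = mid  # already fixed, so the fix is here or earlier
    return commits[hi]


# Toy history: commits 0..9, where everything before commit 6 crashes.
history = list(range(10))
print(first_fixed(history, lambda c: c < 6))
```

The result is only as good as the predicate. If the crash is
intermittent, or if two unrelated bugs produce similar crashes, a
commit can be mislabeled and the search can converge on a commit that
is not the real fix.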

It might also be the next commit

commit ddf6e0e50723b62ac76ed18eb53e9417c6eefba7
ftrace: Fix hash record accounting bug

as the panics in that report look similar to what I see before the
machine quickly and massively dies, but when I apply this on top of
3.1.9 it doesn't avoid the crash.

I suspect maybe I was chasing two separate problems, which is why the
bisect didn't give the proper results.

In any case, I'm abandoning the search for now unless anyone has some
other ideas I could try.

Thanks,

Vince
