Re: [PATCH RFC] perf_counter: Don't swap contexts containing locked mutex

From: Ingo Molnar
Date: Fri May 29 2009 - 08:35:29 EST



* Ingo Molnar <mingo@xxxxxxx> wrote:

> try the latest Git repo (i tried 95110d7) and do this:
>
> make clean
> perf stat -- make -j
>
> that locks up for me, very quickly, with permanently stuck tasks:
>
> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME COMMAND
> 10748 mingo 20 0 0 0 0 R 100.4 0.0 0:06.44 chmod
> 10756 mingo 20 0 0 0 0 R 100.4 0.0 0:06.43 touch
>
> looping in the remove-context retry loop.

ok, after much debugging and tracing this turned out to be the
perf_counter_task_exit() call in kernel/fork.c, in the fork() failure
path. That zapped the task ctx in cpuctx, so the next schedule
(which is rare) did not schedule the real context out. Then,
when the task was scheduled back in again later, we scheduled in
counters that were already active. Much mayhem followed, and the
lockup was a common incarnation of that. I pushed out a couple of
fixes for this.
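
To make the sequence above easier to follow, here is a rough toy model of
it in plain userspace C. It is only a sketch, not the kernel code: every
toy_* name is made up for illustration, and the assumption that the
schedule-out path returns early when cpuctx does not point at the task's
context is a simplification, not a quote of the real implementation.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the per-task and per-CPU state (hypothetical names). */
struct toy_ctx {
	int nr_active;			/* how many times the counters are enabled */
};

struct toy_cpuctx {
	struct toy_ctx *task_ctx;	/* task context this CPU believes is loaded */
};

/* Schedule the task's counters in and record that on the CPU. */
void toy_sched_in(struct toy_cpuctx *cpuctx, struct toy_ctx *ctx)
{
	ctx->nr_active++;
	cpuctx->task_ctx = ctx;
}

/* Schedule the counters out, but only if the CPU thinks they are loaded. */
void toy_sched_out(struct toy_cpuctx *cpuctx, struct toy_ctx *ctx)
{
	if (cpuctx->task_ctx != ctx)
		return;			/* assumed early exit: nothing to do */
	ctx->nr_active--;
	cpuctx->task_ctx = NULL;
}

/* Models the stray zap: the pointer is cleared, the counters stay enabled. */
void toy_zap_task_ctx(struct toy_cpuctx *cpuctx)
{
	cpuctx->task_ctx = NULL;
}

/*
 * Replay: sched in, optionally zap, sched out, sched in again.
 * Returns the final activation count: 1 is healthy, 2 means the
 * counters were scheduled in while already active.
 */
int toy_replay(bool zap)
{
	struct toy_ctx ctx = { 0 };
	struct toy_cpuctx cpuctx = { NULL };

	toy_sched_in(&cpuctx, &ctx);
	if (zap)
		toy_zap_task_ctx(&cpuctx);
	toy_sched_out(&cpuctx, &ctx);	/* skipped entirely after a zap */
	toy_sched_in(&cpuctx, &ctx);

	return ctx.nr_active;
}
```

In this model toy_replay(false) ends with the counters enabled once, while
toy_replay(true) ends with them enabled twice, which is the "scheduled in
already active counters" state.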

Pekka, this appears to match your 'stuck Xorg while make -j'
symptoms pretty accurately - so if you try latest perfcounters/core
it might solve some of those problems as well.

Ingo