UI messages in event thread hangs perf top
From: David Miller
Date: Sun Oct 28 2018 - 00:11:07 EST
If I run perf top with a "make -j128" kernel build, I get ring buffer event
processing timeouts, which result in:
	ui__warning("Too slow to read ring buffer.\n"
		    "Please try increasing the period (-c) or\n"
		    "decreasing the freq (-F) or\n"
		    "limiting the number of CPUs (-C)\n");
from perf_top__mmap_read().
This hangs the main event thread. Only the display thread runs after
this point.
We can't issue UI messages from the event thread, because those will
hang waiting for a keypress. The display thread will eat any keys
we press and the event thread thus hangs forever.
I can tell this is what has happened because the histogram entries
continue to decay, yet the event count stops increasing.
If I attach gdb to the perf process, the backtrace of the event
processing thread is indeed in the select() call done by ui__getch().
Adding insult to injury, the display thread immediately overwrites the
warning message printed by the event thread, and thus the user has no
chance to even see it.
I really wonder how this was tested.
Perhaps we should mark the event thread in a special way and trigger
assertions if UI messages are printed from it. Again, any such
operation will hang the thread and stop all event processing.
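
A minimal sketch of the marking idea, assuming a thread-local flag is
enough to identify the event thread. Every name here (in_event_thread,
mark_event_thread(), ui_blocking_allowed(), blocking_ui_warning()) is
hypothetical and is not an existing perf symbol. blocking_ui_warning()
only stands in for a UI call that would wait for a keypress.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical per-thread marker, false by default in every thread. */
static _Thread_local bool in_event_thread;

/* Hypothetical helper: call once at the start of the event thread. */
void mark_event_thread(void)
{
	in_event_thread = true;
}

/* True if the calling thread may block waiting for UI input. */
bool ui_blocking_allowed(void)
{
	return !in_event_thread;
}

/*
 * Stand-in for a UI call that waits for a keypress.  The assertion
 * fires right away if the event thread gets here, instead of letting
 * it block forever on input that the display thread consumes.
 */
void blocking_ui_warning(const char *msg)
{
	assert(ui_blocking_allowed());
	fputs(msg, stderr);
	/* A real implementation would wait for a key at this point. */
}
```

The event thread would call mark_event_thread() once before entering its
read loop. Any later call to blocking_ui_warning() from that thread then
aborts with an assertion failure, which is much easier to notice than a
silent stop in event processing.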