perf_counters issue with self-sampling threads

From: stephane eranian
Date: Mon Jul 27 2009 - 12:51:18 EST


Hi,

I believe there is a problem with the current perf_counters (PCL)
code for self-sampling threads. The problem is related to sample
notifications via signal.

PCL (just like perfmon) is using SIGIO, an asynchronous signal,
to notify user applications of the availability of data in the event
buffer.

POSIX does not mandate that asynchronous signals be delivered
to the thread in which they originated. Any thread in the process
may process the signal, assuming it does not have the signal
blocked.

This is a serious problem with any self-sampling program such as
those built on top of PAPI. When sampling, you do want the signal
to be delivered to the thread in which the counter overflowed. This is
not just for convenience but it is required if the signal handler needs
to operate on the thread's machine state. Although it is always
possible to forward the signal via tkill() to the right thread, I
do not think this is the right solution, as it incurs additional latency
and therefore skid.
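
For reference, the forwarding workaround mentioned above could look
roughly like the sketch below. It is only an illustration: sampling_tid,
samples_handled and the helper names are hypothetical, and it uses
tgkill() through syscall() rather than plain tkill(). The extra
handler invocation and system call are where the added latency comes
from.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <signal.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical: TID of the thread whose counter is being sampled. */
static volatile pid_t sampling_tid;

/* Hypothetical: counts notifications handled on the right thread. */
static volatile sig_atomic_t samples_handled;

/* Kernel thread id of the caller, via the raw gettid system call. */
static pid_t current_tid(void)
{
    return (pid_t)syscall(SYS_gettid);
}

/*
 * SIGIO handler: if the signal landed on the wrong thread, re-send it
 * to the sampling thread and return; otherwise account for the sample.
 */
static void sigio_forwarding_handler(int sig)
{
    int saved_errno = errno;

    if (current_tid() != sampling_tid)
        syscall(SYS_tgkill, getpid(), sampling_tid, sig);
    else
        samples_handled++;

    errno = saved_errno;
}

/* Install the handler process-wide; returns 0 on success, -1 on error. */
static int install_forwarding_handler(void)
{
    struct sigaction sa;

    sa.sa_handler = sigio_forwarding_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    return sigaction(SIGIO, &sa, NULL);
}
```

The sampling thread would store its own current_tid() in sampling_tid
before enabling the counter. By the time the forwarded signal arrives,
the thread has kept running, so the machine state seen by the handler
has drifted further from the overflow point.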

Looking at the kernel source code related to that, it seems that
kill_fasync() ends up calling group_send_sig_info(). This function
adds the signal to the process SHARED sigpending queue. Then,
it picks a thread to "wakeup". It first tries the thread in which the
signal originated with the following selection test (wants_signal):
- signal is not blocked
- thread is not exiting
- no private signal pending for this thread

If that does not work, it iterates over the other threads of the process.
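
The selection described above can be modeled in user space as follows.
This is a simplified sketch of the three checks as listed in this mail,
not the kernel code: struct model_thread, model_wants_signal() and
model_pick_thread() are hypothetical names, and the order in which the
other threads are visited is an assumption.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a thread; not task_struct. */
struct model_thread {
    bool sig_blocked;     /* the signal is in the thread's blocked mask */
    bool exiting;         /* the thread is exiting */
    bool private_pending; /* a thread-private signal is already pending */
};

/* The three checks from the list above. */
static bool model_wants_signal(const struct model_thread *t)
{
    return !t->sig_blocked && !t->exiting && !t->private_pending;
}

/*
 * Pick the thread to wake up: try the originating thread first, then
 * the remaining threads in round-robin order (assumed order).
 * Returns the chosen index, or -1 if no thread qualifies.
 */
static int model_pick_thread(const struct model_thread *threads,
                             size_t n, size_t origin)
{
    if (n == 0 || origin >= n)
        return -1;
    if (model_wants_signal(&threads[origin]))
        return (int)origin;
    for (size_t i = 1; i < n; i++) {
        size_t idx = (origin + i) % n;
        if (model_wants_signal(&threads[idx]))
            return (int)idx;
    }
    return -1;
}
```

With three idle threads and the overflow occurring in thread 1, the
model picks thread 1. As soon as thread 1 has a private signal pending,
the same call picks another thread instead.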

This explains why in trivial tests, the SIGIO is always delivered
to the right thread. However, if the monitored thread is using any
other signals, e.g., SIGALRM, then the SIGIO signal can go to the
wrong thread. The problem also arises if the first SIGIO is not already
processed by the time a 2nd is pended.

For self-sampling, we want (and in fact require) asynchronous notifications.
But we want the extra guarantee that the signal is ALWAYS delivered to
the thread in which the event occurred.

It seems like we could either create a different version of kill_fasync() or
pass an extra parameter to force this function to use specific_send_sig_info().
This would be only when self-monitoring. When a tool is monitoring another
thread, it is probably okay to have the signal delivered to any thread. Most
likely, the tool is set up such that threads not processing notifications have
the signal blocked.