[PATCH][manpages 2/2] perf_event_open.2: Document write_backward

From: Wang Nan
Date: Mon Mar 28 2016 - 06:17:18 EST


Signed-off-by: Wang Nan <wangnan0@xxxxxxxxxx>
---
man2/perf_event_open.2 | 57 ++++++++++++++++++++++++++++++++++++++++++++++++--
1 file changed, 55 insertions(+), 2 deletions(-)
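
To illustrate the reasoning in the write_backward text below, here is a
minimal userspace sketch of walking a backward ring-buffer from newest to
oldest sample. It is a simulation under stated assumptions, not the
kernel's or perf's implementation: struct rec_header is a local stand-in
that mirrors the layout of struct perf_event_header, the names
sim_write_backward() and walk_backward_ring() are hypothetical helpers,
and the buffer is a plain array rather than a real mmap(2) region.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Local stand-in mirroring the layout of struct perf_event_header. */
struct rec_header {
    uint32_t type;
    uint16_t misc;
    uint16_t size;              /* whole record size, header included */
};

/* Copy len bytes starting at free-running offset off, wrapping at the end. */
static void ring_copy(const uint8_t *data, uint64_t size, uint64_t off,
                      void *dst, size_t len)
{
    uint64_t pos = off & (size - 1);
    size_t first = len;

    if (pos + len > size)
        first = (size_t)(size - pos);
    memcpy(dst, data + pos, first);
    memcpy((uint8_t *)dst + first, data, len - first);
}

/*
 * Hypothetical writer that imitates a backward ring-buffer: head moves
 * down by the record size, then the record is stored at the new head.
 * Payload bytes are filled with 0xAA for demonstration only.
 */
static uint64_t sim_write_backward(uint8_t *data, uint64_t size,
                                   uint64_t head, uint32_t type,
                                   uint16_t rec_size)
{
    struct rec_header h = { type, 0, rec_size };
    size_t i;

    head -= rec_size;
    for (i = 0; i < rec_size; i++) {
        uint64_t pos = (head + i) & (size - 1);

        data[pos] = i < sizeof(h) ? ((const uint8_t *)&h)[i] : 0xAA;
    }
    return head;
}

/*
 * Walk records from newest to oldest, starting at head. Each header
 * gives the size needed to find the next (older) record. Stops at
 * unused space or at a record whose tail has been overwritten.
 * Stores record types in types[] and returns how many were found.
 */
static size_t walk_backward_ring(const uint8_t *data, uint64_t size,
                                 uint64_t head, uint32_t *types, size_t max)
{
    uint64_t consumed = 0;
    size_t n = 0;

    assert(size != 0 && (size & (size - 1)) == 0);  /* power of two */
    while (consumed + sizeof(struct rec_header) <= size && n < max) {
        struct rec_header h;

        ring_copy(data, size, head + consumed, &h, sizeof(h));
        if (h.size < sizeof(h) || consumed + h.size > size)
            break;
        types[n++] = h.type;
        consumed += h.size;
    }
    return n;
}
```

The walk only ever needs the start of a record, which is why a backward
buffer can be decoded from data_head alone; a forward overwritable buffer
would give the reader the end of the newest record instead.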

diff --git a/man2/perf_event_open.2 b/man2/perf_event_open.2
index b232cba..942a410 100644
--- a/man2/perf_event_open.2
+++ b/man2/perf_event_open.2
@@ -234,8 +234,10 @@ struct perf_event_attr {
mmap2 : 1, /* include mmap with inode data */
comm_exec : 1, /* flag comm events that are due to exec */
use_clockid : 1, /* use clockid for time fields */
+ context_switch : 1, /* context switch data */
+ write_backward : 1, /* Write ring buffer from end to beginning */

- __reserved_1 : 38;
+ __reserved_1 : 36;

union {
__u32 wakeup_events; /* wakeup every n events */
@@ -1105,6 +1107,30 @@ field.
This can make it easier to correlate perf sample times with
timestamps generated by other tools.
.TP
+.IR "write_backward" " (since Linux 4.6)"
+.\" commit ? (http://lkml.kernel.org/g/1459147292-239310-5-git-send-email-wangnan0@xxxxxxxxxx)
+This makes the resulting event use a backward ring-buffer, which
+writes samples from the end of the ring-buffer.
+
+Events with backward and forward ring-buffer settings cannot be
+connected together using
+.BR PERF_EVENT_IOC_SET_OUTPUT .
+
+A backward ring-buffer is useful when the ring-buffer is overwritable
+(created by a read-only
+.BR mmap (2)).
+In this case,
+.I data_tail
+is not used, and
+.I data_head
+points to the start of the most recent sample.
+The whole ring-buffer can be iterated by reading samples one by one,
+because the size of a sample can be decoded from its header.
+In contrast, in a forward overwritable ring-buffer the only
+information is the end of the most recent sample, pointed to by
+.IR data_head ,
+and the size of a sample cannot be determined from its end.
+.TP
.IR "wakeup_events" ", " "wakeup_watermark"
This union sets how many samples
.RI ( wakeup_events )
@@ -1634,7 +1660,9 @@ And vice versa:
.TP
.I data_head
This points to the head of the data section.
-The value continuously increases, it does not wrap.
+The value continuously increases (or decreases if
+.I write_backward
+is set); it does not wrap.
The value needs to be manually wrapped by the size of the mmap buffer
before accessing the samples.

@@ -2581,6 +2609,24 @@ Starting with Linux 3.18,
.B POLL_HUP
is indicated if the event being monitored is attached to a different
process and that process exits.
+.SS Reading from an overwritable ring-buffer
+A reader cannot update
+.I data_tail
+if the mapping is not
+.BR PROT_WRITE .
+In this case, the kernel overwrites data regardless of whether
+it has been read, so the ring-buffer is overwritable and
+behaves like a flight recorder.
+To read from an overwritable ring-buffer, setting
+.I write_backward
+is suggested; otherwise it is hard to find a proper position to start
+decoding.
+In addition, the ring-buffer should be paused before reading, using
+.BR ioctl (2)
+with
+.BR PERF_EVENT_IOC_PAUSE_OUTPUT ,
+to avoid a race between the kernel and the reader.
+The ring-buffer should be resumed after reading finishes.
.SS rdpmc instruction
Starting with Linux 3.4 on x86, you can use the
.\" commit c7206205d00ab375839bd6c7ddb247d600693c09
@@ -2693,6 +2739,13 @@ The file descriptors must all be on the same CPU.

The argument specifies the desired file descriptor, or \-1 if
output should be ignored.
+
+Two events with different
+.I write_backward
+settings cannot be connected together using
+.BR PERF_EVENT_IOC_SET_OUTPUT ;
+.B EINVAL
+is returned in this case.
.TP
.BR PERF_EVENT_IOC_SET_FILTER " (since Linux 2.6.33)"
.\" commit 6fb2915df7f0747d9044da9dbff5b46dc2e20830
--
1.8.3.4