[PATCH/RFC] tracing: make trace_marker{,_raw} stream-like

From: John Keeping
Date: Thu Sep 09 2021 - 10:10:08 EST


The tracing marker files are write-only streams with no meaningful
concept of file position. Using stream_open() to mark them as
stream-link indicates this and has the added advantage that a single
file descriptor can now be used from multiple threads without contention
thanks to clearing FMODE_ATOMIC_POS.

Note that this has the potential to break existing userspace by since
both lseek(2) and pwrite(2) will now return ESPIPE when previously lseek
would have updated the stored offset and pwrite would have appended to
the trace. A survey of libtracefs and several other projects found to
use trace_marker(_raw) [1][2][3] suggests that everyone limits
themselves to calling write(2) and close(2) on these file descriptors so
there is a good chance this will go unnoticed and the benefits of
reduced overhead and lock contention seem worth the risk.

[1] https://github.com/google/perfetto
[2] https://github.com/intel/media-driver/
[3] https://w1.fi/cgit/hostap/

Signed-off-by: John Keeping <john@xxxxxxxxxxxx>
---
I'm marking this as RFC because it breaks the Prime Directive of Linux
development, as explained above. But I'm hoping this is one of the
cases where we get away with it because this is a huge improvement for
multi-threaded programs when doing the simple thing of opening a single
trace_marker FD at startup and just writing to it from any thread.

After writing this, I realised that an alternative solution to my
original problem would have been to use pwrite instead of write! But I
still think fixing this at the source would be better.

kernel/trace/trace.c | 18 ++++++++----------
1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 2dbf797aa133..251679846a1b 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -4834,6 +4834,12 @@ int tracing_open_generic_tr(struct inode *inode, struct file *filp)
return 0;
}

+static int tracing_mark_open(struct inode *inode, struct file *filp)
+{
+ stream_open(inode, filp);
+ return tracing_open_generic_tr(inode, filp);
+}
+
static int tracing_release(struct inode *inode, struct file *file)
{
struct trace_array *tr = inode->i_private;
@@ -7101,9 +7107,6 @@ tracing_mark_write(struct file *filp, const char __user *ubuf,
if (tt)
event_triggers_post_call(tr->trace_marker_file, tt);

- if (written > 0)
- *fpos += written;
-
return written;
}

@@ -7162,9 +7165,6 @@ tracing_mark_raw_write(struct file *filp, const char __user *ubuf,

__buffer_unlock_commit(buffer, event);

- if (written > 0)
- *fpos += written;
-
return written;
}

@@ -7564,16 +7564,14 @@ static const struct file_operations tracing_free_buffer_fops = {
};

static const struct file_operations tracing_mark_fops = {
- .open = tracing_open_generic_tr,
+ .open = tracing_mark_open,
.write = tracing_mark_write,
- .llseek = generic_file_llseek,
.release = tracing_release_generic_tr,
};

static const struct file_operations tracing_mark_raw_fops = {
- .open = tracing_open_generic_tr,
+ .open = tracing_mark_open,
.write = tracing_mark_raw_write,
- .llseek = generic_file_llseek,
.release = tracing_release_generic_tr,
};

--
2.33.0