Re: [PATCH] tracing: Have trace_marker writes be just half of TRACE_SEQ_SIZE
From: Steven Rostedt
Date: Mon Mar 04 2024 - 20:33:36 EST
On Mon, 4 Mar 2024 20:15:57 -0500
Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx> wrote:
> On 2024-03-04 19:27, Steven Rostedt wrote:
> > From: "Steven Rostedt (Google)" <rostedt@xxxxxxxxxxx>
> >
> > Since the size of trace_seq's buffer is the max an event can output, have
> > the trace_marker be half of the entire TRACE_SEQ_SIZE, which is 4K. That
> > will keep writes that has meta data written from being dropped (but
> > reported), because the total output of the print event is greater than
> > what the trace_seq can hold.
>
> Defining the trace_mark limit in terms of "TRACE_SEQ_SIZE / 2"
> seems backwards. It's basically using a define of the maximum
> buffer size for the pretty-printing output and defining the maximum
> input size of a system call to half of that.
>
> I'd rather see, in a header file shared between tracing mark
> write implementation and output implementation:
>
> #define TRACING_MARK_MAX_SIZE 4096
>
> and then a static validation that this input fits within your
> pretty printing output in the output implementation file:
>
> BUILD_BUG_ON(TRACING_MARK_MAX_SIZE + sizeof(meta data stuff...) > TRACE_SEQ_SIZE);
That's not the meta size I'm worried about. The sizeof(meta data) is the
raw event binary data, which is not related to the size of the event output.
# cd /sys/kernel/tracing
# echo hello > trace_marker
# cat trace
[..]
<...>-999 [001] ..... 2296.140373: tracing_mark_write: hello
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
This is the meta data that is added to trace_seq
-- Steve
>
> This way we clearly document that the tracing mark write
> input limit is 4kB, rather than something derived from
> the size of an output buffer.
>
> Thanks,
>
> Mathieu
>
> >
> > Signed-off-by: Steven Rostedt (Google) <rostedt@xxxxxxxxxxx>
> > ---
> > kernel/trace/trace.c | 16 +++++++++++-----
> > 1 file changed, 11 insertions(+), 5 deletions(-)
> >
> > diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
> > index 8198bfc54b58..d68544aef65f 100644
> > --- a/kernel/trace/trace.c
> > +++ b/kernel/trace/trace.c
> > @@ -7320,6 +7320,17 @@ tracing_mark_write(struct file *filp, const char __user *ubuf,
> > if ((ssize_t)cnt < 0)
> > return -EINVAL;
> >
> > + /*
> > + * TRACE_SEQ_SIZE is the total size of trace_seq buffer used
> > + * for output. As the print event outputs more than just
> > + * the string written, keep it smaller than the trace_seq
> > + * as it could drop the event if the extra data makes it bigger
> > + * than what the trace_seq can hold. Half he TRACE_SEQ_SIZE
> > + * is more than enough.
> > + */
> > + if (cnt > TRACE_SEQ_SIZE / 2)
> > + cnt = TRACE_SEQ_SIZE / 2;
> > +
> > meta_size = sizeof(*entry) + 2; /* add '\0' and possible '\n' */
> > again:
> > size = cnt + meta_size;
> > @@ -7328,11 +7339,6 @@ tracing_mark_write(struct file *filp, const char __user *ubuf,
> > if (cnt < FAULTED_SIZE)
> > size += FAULTED_SIZE - cnt;
> >
> > - if (size > TRACE_SEQ_BUFFER_SIZE) {
> > - cnt -= size - TRACE_SEQ_BUFFER_SIZE;
> > - goto again;
> > - }
> > -
> > buffer = tr->array_buffer.buffer;
> > event = __trace_buffer_lock_reserve(buffer, TRACE_PRINT, size,
> > tracing_gen_ctx());
>