Re: [PATCH v3] ring-buffer: Fix writing to the buffer with max_data_size
From: Masami Hiramatsu (Google)
Date: Tue Dec 12 2023 - 18:46:07 EST

On Tue, 12 Dec 2023 11:16:17 -0500
Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
> From: "Steven Rostedt (Google)" <rostedt@xxxxxxxxxxx>
>
> The maximum ring buffer data size is the maximum size of data that can be
> recorded on the ring buffer. Events must be smaller than the sub buffer
> data size minus any meta data. This size is checked before trying to
> allocate from the ring buffer because the allocation assumes that the size
> will fit on the sub buffer.
>
> The maximum size was calculated as the size of a sub buffer page (which is
> currently PAGE_SIZE minus the sub buffer header) minus the size of the
> meta data of an individual event. But it missed the possible adding of a
> time stamp for events that are added long enough apart that the event meta
> data can't hold the time delta.
>
> When an event is added that is greater than the current BUF_MAX_DATA_SIZE
> minus the size of a time stamp, but still less than or equal to
> BUF_MAX_DATA_SIZE, the ring buffer would go into an infinite loop, looking
> for a page that can hold the event. Luckily, there's a check for this loop:
> after 1000 iterations, a warning is emitted and the ring buffer is
> disabled. But this should never happen.
>
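
To make the window described above concrete, here is a rough userspace model of the arithmetic. It is not kernel code: the MODEL_* constants and helper names are made up for illustration, and the page, header and meta-data sizes are assumptions rather than values taken from ring_buffer.c. Only the 8-byte timestamp comes from the commit message.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed values for illustration only; the real ones live in ring_buffer.c. */
#define MODEL_PAGE_SIZE      4096u  /* assumption: 4K pages */
#define MODEL_PAGE_HDR_SIZE  16u    /* assumption: sub-buffer header size */
#define MODEL_EVENT_META     8u     /* assumption: per-event meta data */
#define MODEL_TIME_EXTEND    8u     /* 8-byte timestamp, per the commit message */

#define MODEL_SUBBUF_DATA    (MODEL_PAGE_SIZE - MODEL_PAGE_HDR_SIZE)
#define MODEL_MAX_DATA_SIZE  (MODEL_SUBBUF_DATA - MODEL_EVENT_META)

/* The up-front check: passes for any length up to the maximum. */
static bool model_passes_size_check(size_t len)
{
	return len <= MODEL_MAX_DATA_SIZE;
}

/* Does the event still fit on an empty sub-buffer once a timestamp is prefixed? */
static bool model_fits_with_timestamp(size_t len)
{
	return len + MODEL_TIME_EXTEND <= MODEL_MAX_DATA_SIZE;
}

/* The problematic window: accepted by the check, yet never placeable. */
static bool model_in_bad_window(size_t len)
{
	return model_passes_size_check(len) && !model_fits_with_timestamp(len);
}
```

With these assumed sizes the modelled maximum is 4072 bytes, so lengths 4065 through 4072 pass the check but cannot fit once 8 bytes are prefixed. That is the case where the allocation keeps looking for a page.
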
> This can happen when a large event is added first, or after a long period
> where an absolute timestamp is prefixed to the event, increasing its size
> by 8 bytes. This passes the check and then goes into the algorithm that
> causes the infinite loop.
>
> An event that is the first event on a sub-buffer does not need a
> timestamp added, because the sub-buffer itself contains an absolute
> timestamp, and adding one is redundant.
>
> The fix is to check if the event is to be the first event on the
> sub-buffer, and if it is, then do not add a timestamp.
>
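
For anyone following along, the branch order after the fix can be sketched as a small standalone model. This is only a simplified illustration: struct model_info and model_pick_timestamp() are hypothetical names, the boolean force_stamp stands in for the RB_ADD_STAMP_FORCE | RB_ADD_STAMP_EXTEND flags, and the final "else" branch (not shown in the patch context) is reduced to a return value.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_LEN_TIME_EXTEND 8u  /* 8-byte timestamp, per the commit message */

/* Hypothetical stand-in for the fields of the kernel's event info used here. */
struct model_info {
	uint64_t before;      /* models info->before */
	uint64_t after;       /* models info->after */
	uint64_t delta;       /* models info->delta */
	unsigned int length;  /* models info->length */
	bool force_stamp;     /* stands in for the FORCE | EXTEND flags */
};

/*
 * w is the write offset on the sub-buffer; w == 0 means this is the first
 * event. a_ok and b_ok say whether the stamp reads succeeded.
 * Returns true when the caller would still have to compute the delta
 * (the unchanged "else" branch, which this model does not cover).
 */
static bool model_pick_timestamp(struct model_info *info, unsigned long w,
				 bool a_ok, bool b_ok)
{
	if (!w) {
		/* First event: rely on the sub-buffer timestamp, add nothing. */
		info->delta = 0;
		return false;
	}
	if (!a_ok || !b_ok || info->before != info->after) {
		/* Not the first event and the stamps are unreliable: force one. */
		info->force_stamp = true;
		info->length += MODEL_LEN_TIME_EXTEND;
		return false;
	}
	return true;
}
```

Testing w first means a failed stamp read (a_ok or b_ok false) can no longer grow the length of the first event on a sub-buffer, which is what the old single condition allowed.
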
> This also fixes 32-bit architectures adding a timestamp when a read of
> before_stamp or write_stamp is interrupted. There's still no need to add
> that timestamp if the event is going to be the first event on the sub-buffer.
>
> Also, if the buffer has "time_stamp_abs" set, then also check if the
> length plus the timestamp is greater than the BUF_MAX_DATA_SIZE.
>
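
The "time_stamp_abs" check can be modelled the same way. Again this is only a sketch under assumptions: model_check_abs_length() is a hypothetical helper and MODEL_MAX_DATA_SIZE is an assumed value, not the kernel's BUF_MAX_DATA_SIZE.

```c
#include <stdbool.h>
#include <stddef.h>

#define MODEL_LEN_TIME_EXTEND 8u     /* 8-byte timestamp, per the commit message */
#define MODEL_MAX_DATA_SIZE   4072u  /* assumption, for illustration only */

/*
 * Returns false when the event must be rejected up front. On success,
 * *total holds the length that would actually be reserved.
 */
static bool model_check_abs_length(size_t len, bool time_stamp_abs, size_t *total)
{
	if (time_stamp_abs) {
		/* Every event carries an absolute timestamp in this mode. */
		len += MODEL_LEN_TIME_EXTEND;
		if (len > MODEL_MAX_DATA_SIZE)
			return false;
	}
	*total = len;
	return true;
}
```

With the assumed maximum, a 4070-byte event is rejected when absolute timestamps are enabled and accepted when they are not.
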
> Link: https://lore.kernel.org/all/20231212104549.58863438@xxxxxxxxxxxxxxxxxx/
> Link: https://lore.kernel.org/linux-trace-kernel/20231212071837.5fdd6c13@xxxxxxxxxxxxxxxxxx
>
> Cc: stable@xxxxxxxxxxxxxxx
> Fixes: a4543a2fa9ef3 ("ring-buffer: Get timestamp after event is allocated")
> Fixes: 58fbc3c63275c ("ring-buffer: Consolidate add_timestamp to remove some branches")
> Reported-by: Kent Overstreet <kent.overstreet@xxxxxxxxx> # (on IRC)
> Signed-off-by: Steven Rostedt (Google) <rostedt@xxxxxxxxxxx>

This looks good to me :)

Acked-by: Masami Hiramatsu (Google) <mhiramat@xxxxxxxxxx>

Thank you!

> ---
> Changes since v2: https://lore.kernel.org/linux-trace-kernel/20231212065922.05f28041@xxxxxxxxxxxxxxxxxx
>
> - Just test 'w' first, and then do the rest of the checks.
>
> kernel/trace/ring_buffer.c | 7 ++++++-
> 1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
> index 8d2a4f00eca9..b8986f82eccf 100644
> --- a/kernel/trace/ring_buffer.c
> +++ b/kernel/trace/ring_buffer.c
> @@ -3579,7 +3579,10 @@ __rb_reserve_next(struct ring_buffer_per_cpu *cpu_buffer,
>  		 * absolute timestamp.
>  		 * Don't bother if this is the start of a new page (w == 0).
>  		 */
> -		if (unlikely(!a_ok || !b_ok || (info->before != info->after && w))) {
> +		if (!w) {
> +			/* Use the sub-buffer timestamp */
> +			info->delta = 0;
> +		} else if (unlikely(!a_ok || !b_ok || info->before != info->after)) {
>  			info->add_timestamp |= RB_ADD_STAMP_FORCE | RB_ADD_STAMP_EXTEND;
>  			info->length += RB_LEN_TIME_EXTEND;
>  		} else {
> @@ -3737,6 +3740,8 @@ rb_reserve_next_event(struct trace_buffer *buffer,
>  	if (ring_buffer_time_stamp_abs(cpu_buffer->buffer)) {
>  		add_ts_default = RB_ADD_STAMP_ABSOLUTE;
>  		info.length += RB_LEN_TIME_EXTEND;
> +		if (info.length > BUF_MAX_DATA_SIZE)
> +			goto out_fail;
>  	} else {
>  		add_ts_default = RB_ADD_STAMP_NONE;
>  	}
> --
> 2.42.0
>
--
Masami Hiramatsu (Google) <mhiramat@xxxxxxxxxx>