Re: [PATCH] ring-buffer: Have the buffer update counter be atomic
From: Steven Rostedt
Date: Fri Oct 11 2024 - 12:53:44 EST
On Fri, 11 Oct 2024 17:20:12 +0200
Petr Pavlu <petr.pavlu@xxxxxxxx> wrote:
> From 359f5aa523bc27ca8600b4efae471eefc0514ce0 Mon Sep 17 00:00:00 2001
> From: Petr Pavlu <petr.pavlu@xxxxxxxx>
> Date: Tue, 16 Jul 2024 11:35:00 +0200
> Subject: [PATCH] ring-buffer: Fix reader locking when changing the sub buffer
> order
>
> The function ring_buffer_subbuf_order_set() updates each
> ring_buffer_per_cpu and installs new sub buffers that match the requested
> page order. This operation may be invoked concurrently with readers that
> rely on some of the modified data, such as the head bit (RB_PAGE_HEAD), or
> the ring_buffer_per_cpu.pages and reader_page pointers. However, no
> exclusive access is acquired by ring_buffer_subbuf_order_set(). Modifying
> the mentioned data while a reader also operates on them can then result in
> incorrect memory access and various crashes.
>
> Fix the problem by taking the reader_lock when updating a specific
> ring_buffer_per_cpu in ring_buffer_subbuf_order_set().
>
> Fixes: 8e7b58c27b3c ("ring-buffer: Just update the subbuffers when changing their allocation order")
> Signed-off-by: Petr Pavlu <petr.pavlu@xxxxxxxx>
> ---
> kernel/trace/ring_buffer.c | 41 ++++++++++++++++++++++++--------------
> 1 file changed, 26 insertions(+), 15 deletions(-)
>
> diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
> index 4c24191fa47d..d324f4f9ab9d 100644
> --- a/kernel/trace/ring_buffer.c
> +++ b/kernel/trace/ring_buffer.c
> @@ -6773,36 +6773,38 @@ int ring_buffer_subbuf_order_set(struct trace_buffer *buffer, int order)
> }
>
> for_each_buffer_cpu(buffer, cpu) {
> + struct buffer_data_page *old_free_data_page;
> + struct list_head old_pages;
> + unsigned long flags;
>
> if (!cpumask_test_cpu(cpu, buffer->cpumask))
> continue;
>
> cpu_buffer = buffer->buffers[cpu];
>
> + raw_spin_lock_irqsave(&cpu_buffer->reader_lock, flags);
> +
> /* Clear the head bit to make the link list normal to read */
> rb_head_page_deactivate(cpu_buffer);
>
> - /* Now walk the list and free all the old sub buffers */
> - list_for_each_entry_safe(bpage, tmp, cpu_buffer->pages, list) {
> - list_del_init(&bpage->list);
> - free_buffer_page(bpage);
> - }
> - /* The above loop stopped an the last page needing to be freed */
> - bpage = list_entry(cpu_buffer->pages, struct buffer_page, list);
> - free_buffer_page(bpage);
> -
> - /* Free the current reader page */
> - free_buffer_page(cpu_buffer->reader_page);
> + /*
> + * Collect buffers from the cpu_buffer pages list and the
> + * reader_page on old_pages, so they can be freed later when not
> + * under a spinlock. The pages list is a linked list with no
> + * head, adding old_pages turns it into a regular list with
> + * old_pages being the head.
> + */
> + list_add(&old_pages, cpu_buffer->pages);
> + list_add(&cpu_buffer->reader_page->list, &old_pages);
>
> /* One page was allocated for the reader page */
> cpu_buffer->reader_page = list_entry(cpu_buffer->new_pages.next,
> struct buffer_page, list);
> list_del_init(&cpu_buffer->reader_page->list);
>
> - /* The cpu_buffer pages are a link list with no head */
> + /* Install the new pages, remove the head from the list */
> cpu_buffer->pages = cpu_buffer->new_pages.next;
> - cpu_buffer->new_pages.next->prev = cpu_buffer->new_pages.prev;
> - cpu_buffer->new_pages.prev->next = cpu_buffer->new_pages.next;
> + list_del(&cpu_buffer->new_pages);

Heh, not sure why I didn't just use list_del() on the head itself instead
of open coding it. But anyway, it should be list_del_init(), as I will
likely remove the explicit initialization of new_pages and rely on it always
being initialized. I'd rather have it clean than pointing to something unknown.
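
That is, something like this (untested userspace mock; the mini list_head
and list_del_init() below are just stand-ins to show the idea, not the real
<linux/list.h>):

#include <stdio.h>

struct list_head { struct list_head *next, *prev; };

/* Same shape as the kernel's list_del_init(): unlink entry and point it
 * back at itself so it is a valid empty list head afterwards. */
static void list_del_init(struct list_head *entry)
{
	entry->next->prev = entry->prev;
	entry->prev->next = entry->next;
	entry->next = entry;
	entry->prev = entry;
}

int main(void)
{
	struct list_head a, b, new_pages;
	struct list_head *pages;

	/* new_pages as the head of a two entry list: new_pages -> a -> b */
	new_pages.next = &a;  new_pages.prev = &b;
	a.next = &b;          a.prev = &new_pages;
	b.next = &new_pages;  b.prev = &a;

	/* The open coded unlink in the patch is the first two lines of
	 * list_del_init(); the helper additionally reinitializes the head. */
	pages = new_pages.next;
	list_del_init(&new_pages);

	printf("pages==&a:%d a.prev==&b:%d head empty:%d\n",
	       pages == &a, a.prev == &b, new_pages.next == &new_pages);
	return 0;
}

The point being that list_del_init() leaves new_pages as a valid empty head
instead of still pointing at entries it no longer owns.
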
Care to send this as a real patch?
We can look at any other locking issues here separately, but this should be
fixed now.
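
And just to spell out the old_pages trick for anyone following along, it
boils down to this (again an untested userspace mock; mock_page and the
mini list helpers are made up stand-ins for buffer_page and <linux/list.h>):

#include <stdio.h>
#include <stdlib.h>

struct list_head { struct list_head *next, *prev; };

/* Insert new right after head, same shape as the kernel's list_add() */
static void list_add(struct list_head *new, struct list_head *head)
{
	new->next = head->next;
	new->prev = head;
	head->next->prev = new;
	head->next = new;
}

struct mock_page {
	struct list_head list;	/* first member, so a cast stands in for container_of() */
	int id;
};

int main(void)
{
	/* Three "sub buffers" linked as a headless circular list, the way
	 * cpu_buffer->pages has no dedicated head node. */
	struct mock_page *p[3];
	struct list_head old_pages;
	struct list_head *pages, *n, *tmp;
	int i;

	for (i = 0; i < 3; i++) {
		p[i] = malloc(sizeof(*p[i]));
		p[i]->id = i;
	}
	for (i = 0; i < 3; i++) {
		p[i]->list.next = &p[(i + 1) % 3]->list;
		p[i]->list.prev = &p[(i + 2) % 3]->list;
	}
	pages = &p[0]->list;		/* stand-in for cpu_buffer->pages */

	/*
	 * "Under the reader_lock": splice a head into the headless ring.
	 * After this, old_pages is a normal list head and nothing else
	 * needs to touch the old pages inside the critical section.
	 */
	list_add(&old_pages, pages);

	/* "After dropping the lock": walk from the head and free it all. */
	for (n = old_pages.next; n != &old_pages; n = tmp) {
		struct mock_page *page = (struct mock_page *)n;

		tmp = n->next;
		printf("freeing page %d\n", page->id);
		free(page);
	}
	return 0;
}

All the reader_lock has to cover is the pointer surgery; the actual freeing
of the old sub buffers can then happen after the lock is dropped.
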
-- Steve
> cpu_buffer->cnt++;
>
> /* Clear the new_pages list */
> @@ -6815,11 +6817,20 @@ int ring_buffer_subbuf_order_set(struct trace_buffer *buffer, int order)
> cpu_buffer->nr_pages = cpu_buffer->nr_pages_to_update;
> cpu_buffer->nr_pages_to_update = 0;
>
> - free_pages((unsigned long)cpu_buffer->free_page, old_order);
> + old_free_data_page = cpu_buffer->free_page;
> cpu_buffer->free_page = NULL;
>
> rb_head_page_activate(cpu_buffer);
>
> + raw_spin_unlock_irqrestore(&cpu_buffer->reader_lock, flags);
> +
> + /* Free old sub buffers */
> + list_for_each_entry_safe(bpage, tmp, &old_pages, list) {
> + list_del_init(&bpage->list);
> + free_buffer_page(bpage);
> + }
> + free_pages((unsigned long)old_free_data_page, old_order);
> +
> rb_check_pages(cpu_buffer);
> }
>