Re: lockup in rb_get_reader_page

From: Jiaying Zhang
Date: Tue Apr 06 2010 - 17:12:58 EST

Hi Steven,

On Sat, Apr 3, 2010 at 5:45 AM, Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
> On Fri, 2010-04-02 at 16:10 -0700, Jiaying Zhang wrote:
>> The page offset is the index we added in the buffer_page structure.
>> You can ignore this field. The interesting part here is that both
>> cpu_buffer->head_page and cpu_buffer->reader_page point to the
>> same buffer_page. I am not sure yet how we entered this situation,
> You can ignore the cpu_buffer->head_page, it is used as a reference and
> is not part of the main algorithm. It is just there to tell the reader
> where the last head page was.
>> but the problem is once we get here, we will be in an infinite loop.
> But yes, it should never point to the reader page, because the reader
> controls the head_page __and__ the reader page.
>> At the beginning of the spin loop, we call rb_set_head_page() to grab
>> the head_page. In that function, we check whether a page is the head_page
>> with rb_is_head_page(). The problem is that rb_is_head_page() may
>> return RB_PAGE_MOVED if the head_page has changed to another
>> page, and that is what has happened as the above messages show.
> I don't see where it said that.
> If RB_PAGE_MOVED is returned in rb_set_head_page then something is very
> broken. Because that is only returned if the reader modified the code.
> And since we only allow one reader at a time (we have locks to protect
> that), and the rb_set_head_page is only called by the reader, then this
> would mean another reader is reading the ring buffer.
> I should add a:
>        if ((ret = rb_is_head_page(cpu_buffer, page, page->list.prev))) {
>                RB_WARN_ON(ret == RB_PAGE_MOVED);
>                cpu_buffer->head_page = page;
>                return page;
>        }
I added the RB_WARN_ON(ret == RB_PAGE_MOVED) in rb_set_head_page()
as you suggested and I think it has helped me figure out the problem.

I saw a warning triggered by this WARN_ON this morning and realized
that although we are not doing read from interrupt context, we sometimes
call ring_buffer_empty() from a timer interrupt handler that checks whether
there is new data coming in the trace buffer and if so wakes up the
user-space reader. ring_buffer_empty() calls rb_set_head_page()
that can move the head_page. As far as I understand, it should be
ok to have ring_buffer_empty() preempt a writer so I guess we should leave
that RB_WARN_ON out from rb_set_head_page(). The problem in our case
is that we use our own locking mechanism to guarantee a single reader
instead of using the cpu_buffer->reader_lock so the reader is not synchronized
with ring_buffer_empty(). So when ring_buffer_empty() is called while
we are in the process of swapping the reader_page and head_page, the
head_page pointer can point to the old head, i.e., the new reader_page,
and we will enter into an infinite loop.

I wrapped our rb_get_reader_page() calls with cpu_buffer->reader_lock
spinlock and it seems to have solved the problem.

Thank you very much for the help!


>> Shouldn't we just return 0 in case that head_page has moved so that
>> we can move to the next page in the loop inside rb_set_head_page()?
> No, when the reader moves the page, the RB_PAGE_MOVED forces the writer
> to go into the conflict path (conflict between writer and reader).
> -- Steve
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at
Please read the FAQ at