Re: [PATCH] ring-buffer: Fix polling on trace_pipe
From: Chris Mason
Date: Tue Jul 15 2014 - 13:37:28 EST
On 06/10/2014 02:06 AM, Martin Lau wrote:
> ring_buffer_poll_wait() should always put the poll_table to its wait_queue
> even if there is immediate data available. Otherwise, the following epoll
> and read sequence will eventually hang forever:
> 1. Put some data to make the trace_pipe ring_buffer read ready first
> 2. epoll_ctl(efd, EPOLL_CTL_ADD, trace_pipe_fd, ee)
> 3. epoll_wait()
> 4. read(trace_pipe_fd) till EAGAIN
> 5. Add some more data to the trace_pipe ring_buffer
> 6. epoll_wait() -> this epoll_wait() will block forever
> ~ During the epoll_ctl(efd, EPOLL_CTL_ADD,...) call in step 2,
> ring_buffer_poll_wait() returns immediately without adding poll_table,
> which has poll_table->_qproc pointing to ep_poll_callback(), to its
> wait_queue.
> ~ During the epoll_wait() calls in step 3 and step 6,
> ring_buffer_poll_wait() cannot add ep_poll_callback() to its wait_queue
> because poll_table->_qproc is NULL, which is how epoll works.
> ~ When there is new data available in step 6, ring_buffer does not know
> it has to call ep_poll_callback() because it is not in its wait queue.
> Hence, epoll_wait() blocks forever.
> Other poll implementations seem to call poll_wait() unconditionally as the
> very first thing they do. For example, tcp_poll() in tcp.c.
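For anyone who wants to try the sequence from userspace, steps 2-6 above can be sketched roughly as below. This is only a sketch under assumptions, not the reproducer Martin used: run_epoll_sequence(), wfd and timeout_ms are names made up for illustration, the caller is assumed to have done step 1 (rfd already has data pending and is O_NONBLOCK), and a finite timeout replaces the "block forever" so the hang shows up as a return value of 0.

```c
#include <assert.h>
#include <errno.h>
#include <sys/epoll.h>
#include <unistd.h>

/*
 * Run steps 2-6 on rfd, which must be non-blocking and already readable.
 * wfd is a hypothetical "producer" fd: writing to it must make rfd
 * readable again (step 5).
 * Returns the result of the final epoll_wait(): 1 if it was woken up,
 * 0 if it timed out (the hang described above), -1 on any error.
 */
int run_epoll_sequence(int rfd, int wfd, int timeout_ms)
{
	struct epoll_event ee = { .events = EPOLLIN };
	struct epoll_event ev;
	char buf[4096];
	ssize_t n;
	int efd, ret = -1;

	/* A negative timeout would really block forever; refuse it. */
	assert(timeout_ms >= 0);

	efd = epoll_create1(0);
	if (efd < 0)
		return -1;
	ee.data.fd = rfd;

	/* Step 2: register the fd while data is already pending. */
	if (epoll_ctl(efd, EPOLL_CTL_ADD, rfd, &ee) < 0)
		goto done;

	/* Step 3: the first wait should report the pending data. */
	if (epoll_wait(efd, &ev, 1, timeout_ms) != 1)
		goto done;

	/* Step 4: read until EAGAIN. */
	while ((n = read(rfd, buf, sizeof(buf))) > 0)
		;
	if (n == 0 || errno != EAGAIN)
		goto done;

	/* Step 5: add some more data. */
	if (write(wfd, "x", 1) != 1)
		goto done;

	/* Step 6: with a correct poll implementation this returns 1. */
	ret = epoll_wait(efd, &ev, 1, timeout_ms);
done:
	close(efd);
	return ret;
}
```

With both ends of an ordinary pipe() the last call should return 1. To exercise the case in this patch, rfd would be trace_pipe opened with O_NONBLOCK and wfd something that generates trace data (trace_marker is my assumption here); on an unpatched kernel the expectation from the description above is a 0 return.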
Reviewed-by: Chris Mason <clm@xxxxxx>
This looked horribly wrong to me at first, but Martin walked me through
how the polling code is setting up waiters. We have it in production here.