pipes && EPOLLET, again

From: Oleg Nesterov
Date: Tue Mar 04 2025 - 11:12:30 EST


Linus,

On 03/04, Oleg Nesterov wrote:
>
> and we need to cleanup the poll_usage
> logic first.

We have already discussed this; I'll probably do it later,
but let's forget it for now.

Don't we need the trivial one-liner below anyway?

I am not saying this is a bug, but please consider:

#include <stdio.h>
#include <stdbool.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/epoll.h>
#include <assert.h>

static char buf[16 * 4096];

int main(void)
{
	int pfd[2], efd;
	struct epoll_event evt = { .events = EPOLLIN | EPOLLET };

	pipe(pfd);
	efd = epoll_create1(0);
	epoll_ctl(efd, EPOLL_CTL_ADD, pfd[0], &evt);

	write(pfd[1], buf, 4096);
	assert(epoll_wait(efd, &evt, 1, 0) == 1);

	if (!fork()) {
		write(pfd[1], buf, sizeof(buf));
		assert(0);
	}

	sleep(1);
	assert(epoll_wait(efd, &evt, 1, 0) == 1);

	return 0;
}

the 2nd epoll_wait() fails: it returns 0 even though the child has
already written 15 * PAGE_SIZE bytes and is now blocked on the full
pipe. This doesn't look consistent to me...
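
For comparison, here is a minimal sketch of the case the existing
was_empty check already covers: once the edge has been consumed, a write
into a drained (empty) pipe is reported again. The helper names
poll_once() and check_empty_pipe_edge() are made up for illustration and
are not part of the patch; the sketch only assumes ordinary Linux
edge-triggered epoll behaviour.

```c
#include <sys/epoll.h>
#include <unistd.h>

static char page[4096];

/* Hypothetical helper: number of events seen by a zero-timeout epoll_wait(). */
static int poll_once(int efd)
{
	struct epoll_event out;

	return epoll_wait(efd, &out, 1, 0);
}

/*
 * Hypothetical helper: returns 0 if every step behaves as expected,
 * otherwise the number of the first step that did not.
 */
int check_empty_pipe_edge(void)
{
	struct epoll_event evt = { .events = EPOLLIN | EPOLLET };
	int pfd[2], efd, ret = 0;

	if (pipe(pfd) < 0)
		return 1;
	efd = epoll_create1(0);
	if (efd < 0) {
		close(pfd[0]);
		close(pfd[1]);
		return 2;
	}

	if (epoll_ctl(efd, EPOLL_CTL_ADD, pfd[0], &evt) < 0)
		ret = 3;
	/* Write into the empty pipe: the reader side gets an edge. */
	else if (write(pfd[1], page, sizeof(page)) != (ssize_t)sizeof(page))
		ret = 4;
	else if (poll_once(efd) != 1)
		ret = 5;
	/* Nothing happened since the last wait: the edge is already consumed. */
	else if (poll_once(efd) != 0)
		ret = 6;
	/* Drain the pipe so that the next write sees it empty again. */
	else if (read(pfd[0], page, sizeof(page)) != (ssize_t)sizeof(page))
		ret = 7;
	else if (write(pfd[1], page, sizeof(page)) != (ssize_t)sizeof(page))
		ret = 8;
	/* The write found the pipe empty, so a new edge is reported. */
	else if (poll_once(efd) != 1)
		ret = 9;

	close(efd);
	close(pfd[0]);
	close(pfd[1]);
	return ret;
}
```

Calling check_empty_pipe_edge() from main() and checking for a zero
return is enough to run it. The reproducer above differs only in that
the pipe is not empty when the child starts its blocking write.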

Oleg.
---

diff --git a/fs/pipe.c b/fs/pipe.c
index b0641f75b1ba..8a32257cc74f 100644
--- a/fs/pipe.c
+++ b/fs/pipe.c
@@ -554,7 +554,7 @@ anon_pipe_write(struct kiocb *iocb, struct iov_iter *from)
 		 * become empty while we dropped the lock.
 		 */
 		mutex_unlock(&pipe->mutex);
-		if (was_empty)
+		if (was_empty || pipe->poll_usage)
 			wake_up_interruptible_sync_poll(&pipe->rd_wait, EPOLLIN | EPOLLRDNORM);
 		kill_fasync(&pipe->fasync_readers, SIGIO, POLL_IN);
 		wait_event_interruptible_exclusive(pipe->wr_wait, pipe_writable(pipe));