Re: [PATCH] pipe_read: don't wake up the writer if the pipe is still full

From: Hillf Danton
Date: Mon Mar 10 2025 - 06:55:25 EST


On Sun, 9 Mar 2025 18:02:55 +0100 Oleg Nesterov
>
> Well. Prateek has already provided the lengthy/thorough explanation,
> but let me add anyway...
>
lengthy != correct

> On 03/08, Hillf Danton wrote:
> > On Fri, 7 Mar 2025 13:34:43 +0100 Oleg Nesterov <oleg@xxxxxxxxxx>
> > > On 03/07, Oleg Nesterov wrote:
> > > > On 03/07, Hillf Danton wrote:
> > > > > On Fri, 7 Mar 2025 11:54:56 +0530 K Prateek Nayak <kprateek.nayak@xxxxxxx>
> > > > > >> step-03
> > > > > >> task-118766 new reader
> > > > > >> makes pipe empty
> > > > > >
> > > > > >Reader seeing a pipe full should wake up a writer allowing 118768 to
> > > > > >wakeup again and fill the pipe. Am I missing something?
> > > > > >
> > > > > Good catch, but that wakeup was cut off [2,3]
> > >
> > > Please note that "that wakeup" was _not_ removed by the patch below.
> > >
> > After another look, you did cut it.
>
> I still don't think so.
>
> > Link: https://lore.kernel.org/all/20250209150718.GA17013@xxxxxxxxxx/
> ...
> > --- a/fs/pipe.c
> > +++ b/fs/pipe.c
> > @@ -360,29 +360,9 @@ anon_pipe_read(struct kiocb *iocb, struct iov_iter *to)
> >  			break;
> >  		}
> >  		mutex_unlock(&pipe->mutex);
> > -
> >  		/*
> >  		 * We only get here if we didn't actually read anything.
> >  		 *
> > -		 * However, we could have seen (and removed) a zero-sized
> > -		 * pipe buffer, and might have made space in the buffers
> > -		 * that way.
> > -		 *
> > -		 * You can't make zero-sized pipe buffers by doing an empty
> > -		 * write (not even in packet mode), but they can happen if
> > -		 * the writer gets an EFAULT when trying to fill a buffer
> > -		 * that already got allocated and inserted in the buffer
> > -		 * array.
> > -		 *
> > -		 * So we still need to wake up any pending writers in the
> > -		 * _very_ unlikely case that the pipe was full, but we got
> > -		 * no data.
> > -		 */
> > -		if (unlikely(wake_writer))
> > -			wake_up_interruptible_sync_poll(&pipe->wr_wait, EPOLLOUT | EPOLLWRNORM);
> > -		kill_fasync(&pipe->fasync_writers, SIGIO, POLL_OUT);
> > -
> > -		/*
> >  		 * But because we didn't read anything, at this point we can
> >  		 * just return directly with -ERESTARTSYS if we're interrupted,
> >  		 * since we've done any required wakeups and there's no need
> > @@ -391,7 +371,6 @@ anon_pipe_read(struct kiocb *iocb, struct iov_iter *to)
> >  		if (wait_event_interruptible_exclusive(pipe->rd_wait, pipe_readable(pipe)) < 0)
> >  			return -ERESTARTSYS;
> >
> > -		wake_writer = false;
> >  		wake_next_reader = true;
> >  		mutex_lock(&pipe->mutex);
> >  	}
>
> Please note that in this particular case (hackbench testing)
> pipe_write() -> copy_page_from_iter() never fails. So wake_writer is
> never true before pipe_reader() calls wait_event(pipe->rd_wait).
>
Given that "never" and the BUG_ON() below, you accidentally prove that
Prateek's comment is false, no?
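
For reference, the quoted claim about wake_writer can be restated as a toy
user-space model. This is only a sketch under stated assumptions, not the code
in fs/pipe.c: it assumes the reader sets wake_writer when it frees a slot while
the pipe is full, and that a zero-length slot stands in for the EFAULT case
from the removed comment. All names (model_pipe, model_write, model_read,
model_result, MODEL_SLOTS) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_SLOTS 4	/* hypothetical ring size, not pipe->max_usage */

/* Toy pipe: len[] holds the byte count of each occupied slot. */
struct model_pipe {
	unsigned int head;
	unsigned int tail;
	size_t len[MODEL_SLOTS];
};

struct model_result {
	size_t bytes;		/* bytes the reader consumed */
	bool wake_writer;	/* reader freed a slot while the pipe was full */
	bool would_sleep;	/* reader got nothing and would wait on rd_wait */
};

static bool model_full(const struct model_pipe *p)
{
	return p->head - p->tail >= MODEL_SLOTS;
}

static bool model_empty(const struct model_pipe *p)
{
	return p->head == p->tail;
}

/* Append one slot of n bytes; n == 0 models a writer that hit EFAULT. */
static bool model_write(struct model_pipe *p, size_t n)
{
	if (model_full(p))
		return false;
	p->len[p->head % MODEL_SLOTS] = n;
	p->head++;
	return true;
}

/* One pass of the reader loop, up to the point where it would sleep. */
static struct model_result model_read(struct model_pipe *p, size_t want)
{
	struct model_result r = { 0, false, false };

	assert(p->head - p->tail <= MODEL_SLOTS);

	while (want && !model_empty(p)) {
		size_t *len = &p->len[p->tail % MODEL_SLOTS];
		size_t chunk = *len < want ? *len : want;

		*len -= chunk;
		want -= chunk;
		r.bytes += chunk;
		if (*len == 0) {
			/* Slot is released: remember whether the pipe was full. */
			r.wake_writer |= model_full(p);
			p->tail++;
		}
	}

	/* Nothing read but data still wanted: this is the wait_event() path. */
	r.would_sleep = (r.bytes == 0 && want != 0);
	return r;
}
```

In this model a reader that drains a full pipe returns data with wake_writer
set, so the wakeup belongs to the return path rather than the sleep path. The
sleep path sees wake_writer set only when every slot it removed was zero-length.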

> So (again, in this particular case) we could apply the patch below
> on top of Linus's tree.
>
> So, with or without these changes, the writer should be woken up at
> step-03 in your scenario.
>
Fine. Before I check my scenario once more, feel free to pinpoint the
line number where the writer is woken up with the change below applied.

> Oleg.
> ---
>
> --- a/fs/pipe.c
> +++ b/fs/pipe.c
> @@ -360,27 +360,7 @@ pipe_read(struct kiocb *iocb, struct iov_iter *to)
>  		}
>  		mutex_unlock(&pipe->mutex);
>
> -		/*
> -		 * We only get here if we didn't actually read anything.
> -		 *
> -		 * However, we could have seen (and removed) a zero-sized
> -		 * pipe buffer, and might have made space in the buffers
> -		 * that way.
> -		 *
> -		 * You can't make zero-sized pipe buffers by doing an empty
> -		 * write (not even in packet mode), but they can happen if
> -		 * the writer gets an EFAULT when trying to fill a buffer
> -		 * that already got allocated and inserted in the buffer
> -		 * array.
> -		 *
> -		 * So we still need to wake up any pending writers in the
> -		 * _very_ unlikely case that the pipe was full, but we got
> -		 * no data.
> -		 */
> -		if (unlikely(wake_writer))
> -			wake_up_interruptible_sync_poll(&pipe->wr_wait, EPOLLOUT | EPOLLWRNORM);
> -		kill_fasync(&pipe->fasync_writers, SIGIO, POLL_OUT);
> -
> +		BUG_ON(wake_writer);
>  		/*
>  		 * But because we didn't read anything, at this point we can
>  		 * just return directly with -ERESTARTSYS if we're interrupted,
>
>