Re: 4.9.0 regression in pipe-backed iov_iter with systemd-nspawn

From: Al Viro
Date: Fri Jan 13 2017 - 16:55:13 EST


On Fri, Jan 13, 2017 at 08:47:31PM +0000, Al Viro wrote:
> On Fri, Jan 13, 2017 at 12:32:37PM -0800, Linus Torvalds wrote:
>
> > Ugh. I still think your patch is butt-ugly, and the index comparisons
> > make me nervous, but..
>
> No arguments here - 6am on 20-odd hours of uptime is _not_ a good time
> for writing, especially since the data structure needs better documentation
> and probably a couple of inlined helpers. I'll try to massage the damn
> thing into more readable form.

FWIW, I think it will be more readable if we separate the "advance" and
"truncate" parts like this (warning: not even build-tested). Comments?

static inline void pipe_truncate(struct iov_iter *i)
{
struct pipe_inode_info *pipe = i->pipe;
if (pipe->nrbufs) {
size_t off = i->iov_offset;
int idx = i->idx;
int n;

n = (pipe->curbuf + pipe->nrbufs - idx) & (pipe->buffers - 1);
if (off) {
pipe->bufs[idx].len = off - pipe->bufs[idx].offset;
/* free all after idx; n can't be 0 */
idx = next_idx(idx, pipe);
n--;
} else {
/* free all _starting_ at idx.
* n is 0 when we have nothing to do
* *or* when we are truncating full pipe to empty.
*/
if (pipe->nrbufs == pipe->buffers && !n)
n = pipe->buffers;
}
while (n--) {
pipe_buf_release(pipe, &pipe->bufs[idx]);
idx = next_idx(idx, pipe);
pipe->nrbufs--;
}
}
}

static void pipe_advance(struct iov_iter *i, size_t size)
{
struct pipe_inode_info *pipe = i->pipe;
if (unlikely(i->count < size))
size = i->count;
if (size) {
struct pipe_buffer *buf;
size_t off = i->iov_offset, left = size;
int idx = i->idx;
if (off) /* make it relative to the beginning of buffer */
left += off - pipe->bufs[idx].offset;
while (1) {
buf = &pipe->bufs[idx];
if (left <= buf->len)
break;
left -= buf->len;
idx = next_idx(idx, pipe);
}
i->idx = idx;
i->iov_offset = buf->offset + left;
i->count -= size;
}
/* ... and discard everything past that point */
pipe_truncate(i);
}