Re: INFO: task hung in pipe_write (2)

From: Andreas Gruenbacher
Date: Mon Oct 14 2019 - 16:40:58 EST


Hi Darrick,

On Thu, Sep 19, 2019 at 11:10 PM Darrick J. Wong
<darrick.wong@xxxxxxxxxx> wrote:
> On Thu, Sep 19, 2019 at 10:55:44PM +0200, Rasmus Villemoes wrote:
> > On 19/09/2019 19.19, syzbot wrote:
> > > Hello,
> > >
> > > syzbot found the following crash on:
> > >
> > > HEAD commit: 288b9117 Add linux-next specific files for 20190918
> > > git tree: linux-next
> > > console output: https://syzkaller.appspot.com/x/log.txt?x=17e86645600000
> > > kernel config: https://syzkaller.appspot.com/x/.config?x=f6126e51304ef1c3
> > > dashboard link:
> > > https://syzkaller.appspot.com/bug?extid=3c01db6025f26530cf8d
> > > compiler: gcc (GCC) 9.0.0 20181231 (experimental)
> > > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=11855769600000
> > > C reproducer: https://syzkaller.appspot.com/x/repro.c?x=143580a1600000
> > >
> > > The bug was bisected to:
> > >
> > > commit cfb864757d8690631aadf1c4b80022c18ae865b3
> > > Author: Darrick J. Wong <darrick.wong@xxxxxxxxxx>
> > > Date: Tue Sep 17 16:05:22 2019 +0000
> > >
> > > splice: only read in as much information as there is pipe buffer space
> >
> > The middle hunk (the one before splice_pipe_to_pipe()) accesses
> > opipe->{buffers, nrbufs}, but opipe is not locked at that point. So
> > maybe we end up passing len==0, which seems (once there's room in opipe)
> > it would put a zero-length pipe_buffer in opipe - and that probably
> > violates an invariant somewhere.
> >
> > But does the splice_pipe_to_pipe() case even need that extra logic?
> > Doesn't it handle short writes correctly already?
>
> Yep. I missed the part where splice_pipe_to_pipe is already perfectly
> capable of detecting insufficient space in opipe and kicking opipe's
> readers to clear out the buffer. So that hunk isn't needed, and now I'm
> wondering how in the other clause we return 0 from wait_for_space yet
> still don't have buffer space...
>
> Oh well, back to the drawing board. Good catch, though now it's become
> painfully clear that xfstests lacks rigorous testing of splice()...

Have you had any luck figuring out how to fix this? We're still
suffering from the regression I reported a while ago (*).

If not, I wonder if reverting commit 8f67b5adc030 would make sense for now.

* https://lore.kernel.org/linux-fsdevel/CAHpGcM+WQYFHOOC8SzKq+=DuHVZ4fw4RHLTMUDN-o6GX3YtGvQ@xxxxxxxxxxxxxx/T/#u

Thanks,
Andreas