Re: splice() from /dev/zero to a pipe does not work (5.9+)

From: Colin Ian King
Date: Wed May 12 2021 - 11:29:29 EST


On 07/05/2021 19:21, Kees Cook wrote:
> On Fri, May 07, 2021 at 07:05:51PM +0100, Colin Ian King wrote:
>> Hi,
>>
>> While doing some micro benchmarking with stress-ng I discovered that
>> since linux 5.9 the splicing from /dev/zero to a pipe now fails with
>> -EINVAL.
>>
>> I bisected this down to the following commit:
>>
>> 36e2c7421f02a22f71c9283e55fdb672a9eb58e7 is the first bad commit
>> commit 36e2c7421f02a22f71c9283e55fdb672a9eb58e7
>> Author: Christoph Hellwig <hch@xxxxxx>
>> Date: Thu Sep 3 16:22:34 2020 +0200
>>
>> fs: don't allow splice read/write without explicit ops
>>
>> I'm not sure if this has been reported before, or if it's intentional
>> behavior or not. As it stands, it's a regression in the stress-ng splice
>> test case.
>
> The general loss of generic splice read/write is known. Here's some
> early conversations I was involved in:
>
> https://lore.kernel.org/lkml/20200818200725.GA1081@xxxxxx/
> https://lore.kernel.org/lkml/202009181443.C2179FB@keescook/
> https://lore.kernel.org/lkml/20201005204517.2652730-1-keescook@xxxxxxxxxxxx/
>
> And it's been getting re-implemented in individual places:
>
> $ git log --oneline --no-merges --grep 36e2c742
> 42984af09afc jffs2: Hook up splice_write callback
> a35d8f016e0b nilfs2: make splice write available again
> f8ad8187c3b5 fs/pipe: allow sendfile() to pipe again
> f2d6c2708bd8 kernfs: wire up ->splice_read and ->splice_write
> 9bb48c82aced tty: implement write_iter
> dd78b0c483e3 tty: implement read_iter
> 14e3e989f6a5 proc mountinfo: make splice available again
> c1048828c3db orangefs: add splice file operations
> 960f4f8a4e60 fs: 9p: add generic splice_write file operation
> cf03f316ad20 fs: 9p: add generic splice_read file operations
> 06a17bbe1d47 afs: Fix copy_file_range()

Ah, so this explains why copy_file_range() now also returns -EINVAL on
some file systems, such as minix, since it also splices via
do_splice_direct(). :-/

>
> So the question is likely, "do we want this for /dev/zero?"
>
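
For reference, the failing case boils down to roughly the following. This
is a minimal sketch rather than the actual stress-ng code, and the helper
name splice_zero_to_pipe() is made up for this mail. On an affected
kernel I would expect it to return -EINVAL; where splicing from /dev/zero
works it should return the requested length.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: splice len bytes from /dev/zero into a fresh pipe.
 * Returns the number of bytes spliced, or -errno on failure.
 */
static long splice_zero_to_pipe(size_t len)
{
    int pipefd[2];
    int zero_fd;
    ssize_t n;
    long ret;

    zero_fd = open("/dev/zero", O_RDONLY);
    if (zero_fd < 0)
        return -errno;

    if (pipe(pipefd) < 0) {
        ret = -errno;
        close(zero_fd);
        return ret;
    }

    /* This is the call that fails with EINVAL since 5.9. */
    n = splice(zero_fd, NULL, pipefd[1], NULL, len, 0);
    ret = (n < 0) ? -errno : (long)n;

    close(pipefd[0]);
    close(pipefd[1]);
    close(zero_fd);
    return ret;
}
```

Calling splice_zero_to_pipe(4096) and checking the result against -EINVAL
is enough to tell the two behaviours apart; 4096 bytes fits comfortably
in a default-sized pipe, so the call should not block.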