Re: copy on write for splice() from file to pipe?

From: Jens Axboe
Date: Fri Feb 10 2023 - 17:51:17 EST


On 2/10/23 3:35?PM, Linus Torvalds wrote:
> On Fri, Feb 10, 2023 at 2:26 PM Jens Axboe <axboe@xxxxxxxxx> wrote:
>>>
>>> (I actually suspect that /dev/zero no longer works as a splice source,
>>> since we disabled the whole "fall back to regular IO" that Christoph
>>> did in 36e2c7421f02 "fs: don't allow splice read/write without
>>> explicit ops").
>>
>> Yet another one... Since it has a read_iter, should be fixable with just
>> adding the generic splice_read.
>
> I actually very consciously did *not* want to add cases of
> generic_splice_read() "just because we can".
>
> I've been on a "let's minimize the reach of splice" thing for a while.
> I really loved Christoph's patches, even if I may not have been hugely
> vocal about it. His getting rid of set/get_fs() got rid of a *lot* of
> splice pain.
>
> And rather than try to make everything work with splice that used to
> work just because it fell back on read/write, I was waiting for actual
> regression reports.
>
> Even when splice fails, a lot of user space then falls back on
> read/write, and unless there is some really fundamental reason not to,
> I think that's always the right thing to do.
>
> So we do have a number of "add splice_write/splice_read" commits, but
> they are hopefully all the result of people actually noticing
> breakage.
>
> You can do
>
> git log --grep=36e2c7421f02
>
> to see at least some of them, and I really don't want to see them
> without a "Reported-by" and an actual issue.

Oh I already did that the last few times (and there's quite a bit). And
while I agree that getting rid of the ancient set/get_fs bits was great,
it is still annoying to knowingly have regressions. The problem with
this approach is that the time from when you start the "experiment" to
when the first report comes in, it'll take a while as most don't run
-git kernels at all. And the ones that do are generally just standard
distro setups on their workstation/laptop.

The time is my main concern, it takes many years before you're fully
covered. Maybe if that series had been pushed to stable as well we'd
have a better shot at weeding them out.

> Exactly because I'm not all that enamoured with splice any more.

I don't think anyone has been for years, I sure have not and haven't
worked on it in decades outside of exposing some for io_uring.The
latter was probably a mistake and we should've done something else, but
there is something to be said for the devil you know... Outside of that,
looks like the last real change was support for bigger pipes in 2010.
But:

1) Interesting bits do come up. Some of these, as this discussion has
highlighted, are probably better served somewhere else, especially if
they require changes/additions. Some may be valid and fine.

2) Knowingly breaking things isn't very nice, and if anyone else did
that, they'd have you screaming at them.

So while I do kind of agree with you on some points, I don't think it
was done very well from that perspective. And when we spot things like
zero not working with splice, we should probably add the patch to make
it work rather than wait for someone to complain. I just recently had to
fixup random/urandom for that because of a report, and I'd like to think
I have better things to do than deal with known fallout.

--
Jens Axboe