Re: Splicing to/from a tty

From: Al Viro
Date: Wed Jan 20 2021 - 11:31:45 EST


On Mon, Jan 18, 2021 at 07:54:00PM +0000, Al Viro wrote:
> On Mon, Jan 18, 2021 at 11:46:42AM -0800, Linus Torvalds wrote:
> > On Mon, Jan 18, 2021 at 11:35 AM Al Viro <viro@zeniv-ca> wrote:
> > >
> > > I'd rather have sendfile(2) do what splice(2) does and handle pipes
> > > directly. Let me take a look,,,
> >
> > It's not technically just the sendfile() thing. Some things do
> > sendfile "internally" (ie they use do_splice_direct()).
> >
> > Although maybe they always happen just on non-pipe destinations (ie
> > file-to-file copy). I didn't check.
> >
> > But particularly if it can be handled in do_splice_direct(), that
> > would be a good thing.
>
> FWIW, it might make sense to merge do_splice_direct() and do_splice();
> interfaces are very similar and do_splice() is basically
> if both are pipes
> ....
> else if input is pipe
> ....
> else if output is pipe
> ....
> else
> return -EINVAL
> with do_splice_direct() being fairly close to the missing case.

Yecchhh... The situation with checks is interesting.
do_splice():
in needs FMODE_READ, out - FMODE_WRITE [EBADF]
if in is a pipe, off_in must be NULL. [ESPIPE]
if out is a pipe, off_out must be NULL. [ESPIPE]
if in is not a pipe
non-NULL off_in is allowed only with FMODE_PREAD [EINVAL]
rw_verify_area done on in
if out is not a pipe
non-NULL off_out is allowed only with FMODE_PWRITE [EINVAL]
no O_APPEND on out [EINVAL]
rw_verify_area done on out
either in or out must be a pipe [EINVAL]
do_splice_direct():
out needs FMODE_WRITE [EBADF]
rw_verify_area done on out
no O_APPEND on out [EINVAL]
Callers:
__do_splice() -> do_splice():
if in is a pipe, off_in must be NULL. [ESPIPE, duplicate]
if out is a pipe, off_out must be NULL. [ESPIPE, duplicate]

io_uring io_splice() -> do_splice():
no extra checks

do_sendfile() -> do_splice_direct():
in needs FMODE_READ [EBADF]
out needs FMODE_WRITE [EBADF, duplicate]
non-NULL off_in is allowed only with FMODE_PREAD [EINVAL]
for out we act as splice with NULL off_out (== use ->f_pos)
rw_verify_area done on in
rw_verify_area done on out [duplicate]

vfs_copy_file_range() -> do_copy_file_range() ->
-> generic_copy_file_range() or __ceph_copy_file_range() ->
-> do_splice_direct():
both in and out must be regular [EINVAL or EISDIR]
in needs FMODE_READ [EBADF]
out needs FMODE_WRITE [EBADF]
no O_APPEND on out [EBADF]
on immutable for out [EPERM]
rw_verify_area done on in
rw_verify_area done on out [duplicate]
FMODE_P{READ,WRITE} is ignored. I wonder what happens if
somebody starts playing with copy_file_range(2) on
e.g. procfs or sysfs...

ovl_copy_up_date() -> do_splice_direct():
both in and out are hopefully regular - I'm not sure that copyup
logics is sufficiently protected, TBH.
FMODE_READ on in and FMODE_WRITE on out are guaranteed
lack of O_APPEND on out is guaranteed
FMODE_P{READ,WRITE} is completely ignored. Should be OK, modulo
the same issues with protection against manipulation of
layers.

Checks in vfs_copy_file_range() need to remain there - do_splice_direct() is
not guaranteed downstream of that.

AFAICS, we have the following:
in must have FMODE_READ
out must have FMODE_WRITE
no O_APPEND on out, unless a pipe[1]
for non-pipe out we do rw_verify_area
for non-pipe in we do rw_verify_area
no offsets for pipes

The lack of offsets for pipes should've been "not without FMODE_P{READ,WRITE}",
but that's not entirely consistent - we have splice(2) fail with EINVAL in
case of non-NULL off_in on a file that doesn't allow pread(), except that
when this file happens to be a pipe (which obviously doesn't allow pread())
we fail with ESPIPE instead. Note that for pread(2)/pwrite(2)/sendfile(2)
we fail with consistent ESPIPE in all such cases.

We also do not check FMODE_PREAD/FMODE_PWRITE for copy_file_range()/copyup.
Which is probably wrong.

Another inconsistency is that rw_verify_area() in copyup is done for
output (once for each chunk), but not for input. The tricky part here
is that there's an LSM hook in the end of rw_verify_area()...

Can we turn "not a pipe, but lacks FMODE_PREAD" error value for splice(2)
from EINVAL to ESPIPE? Would make the logics easier to consolidate...
Right now the checks are spread over many layers of several call chains
and rather hard to follow.

[1] yes, it is possible to have O_APPEND opened pipes - open a FIFO with
O_APPEND and you've got it. We are not quite consistent in handling
those - sendfile() to such is rejected, splice() is not.