Re: [PATCH v3] fs/splice: don't block splice_direct_to_actor() after data was read
From: Jan Kara
Date: Wed Jun 05 2024 - 12:19:16 EST

On Tue 04-06-24 21:24:14, Max Kellermann wrote:
> On Tue, Jun 4, 2024 at 3:27 PM Jan Kara <jack@xxxxxxx> wrote:
> > OK, so that was not clear to me (and this may well be just my ignorance of
> > networking details). Do you say that your patch changes the behavior only
> > for this cornercase? Even if the socket fd is blocking? AFAIU with your
> > patch we'd return short write in that case as well (roughly 64k AFAICT
> > because that's the amount the internal splice pipe will take) but currently
> > we block waiting for more space in the socket bufs?
>
> My patch changes only the file-read side, not the socket-write side.
> It adds IOCB_NOWAIT for reading from the file, just like
> filemap_read() does. Therefore, it does not matter whether the socket
> is non-blocking.
>
> But thanks for the reply - this was very helpful input for me because
> I have to admit that part of my explanation was wrong:
> I misunderstood how sending to a blocking socket works. I thought that
> send() and sendfile() would return after sending at least one byte
> (only recv() works that way), but in fact both block until everything
> has been submitted.

Yeah, this was exactly what I was trying to point at...

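For readers following along from user space: the IOCB_NOWAIT behaviour described for the file-read side has a rough user-space counterpart in preadv2() with RWF_NOWAIT, which returns whatever is already cached (possibly a short read) and fails with EAGAIN instead of waiting for disk I/O. The sketch below is not the patch and does not touch splice; the helper name read_prefer_cached() and its `waited` out-parameter are made up for illustration, and the fallback to a plain pread() is only one possible policy.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* needed for preadv2() and RWF_NOWAIT in glibc */
#endif
#include <errno.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Hypothetical helper (illustration only): read up to `len` bytes at
 * offset `off`, preferring data that is already in the page cache.
 *
 * First try preadv2(..., RWF_NOWAIT). If the kernel reports that it
 * would have to wait (EAGAIN), or that the flag or syscall is not
 * supported here (EOPNOTSUPP, ENOSYS), fall back to a blocking pread()
 * and set *waited to 1 so the caller can tell the two cases apart.
 *
 * Returns the number of bytes read (possibly fewer than `len`), 0 at
 * end of file, or -1 with errno set.
 */
static ssize_t read_prefer_cached(int fd, void *buf, size_t len, off_t off,
                                  int *waited)
{
    struct iovec iov = { .iov_base = buf, .iov_len = len };
    ssize_t n;

    *waited = 0;
    n = preadv2(fd, &iov, 1, off, RWF_NOWAIT);
    if (n >= 0)
        return n;               /* cached data, may be a short read */

    if (errno != EAGAIN && errno != EOPNOTSUPP && errno != ENOSYS)
        return -1;              /* real error, report it */

    *waited = 1;
    return pread(fd, buf, len, off);
}
```

A caller that can cope with partial reads would use the short result as-is and come back for the rest later, which is the trade-off being discussed in this thread.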
> I could change this to only use IOCB_NOWAIT if the destination is
> non-blocking, but something about this sounds wrong - it changes the
> read side just because the write side is non-blocking.
> We can't change the behavior out of fear of breaking applications; but
> can we have a per-file flag so applications can opt into partial
> reads/writes? This would be useful for all I/O on regular files (and
> sockets and everything else). There would not be any guarantees, just
> allowing the kernel to use relaxed semantics for those who can deal
> with partial I/O.
> Maybe I'm overthinking things and I should just fast-track full
> io_uring support in my code...

Adding open flags is messy (because they are different on different
architectures AFAIR 8-|) and I'm not sure we have many left. In principle
it is possible and I agree it seems useful. But I'm not sure how widely it
would be used since, as you write above, these days the time may be better
spent on converting code with such needs to io_uring...

Honza
--
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR