Re: [PATCH 08/23] fs: don't change the address limit for ->write_iter in __kernel_write

From: Christoph Hellwig
Date: Thu Jul 30 2020 - 03:02:23 EST


On Wed, Jul 29, 2020 at 09:50:36PM +0100, Al Viro wrote:
> On Tue, Jul 07, 2020 at 07:47:46PM +0200, Christoph Hellwig wrote:
> > If we write to a file that implements ->write_iter there is no need
> > to change the address limit if we send a kvec down. Implement that
> > case, and prefer it over using plain ->write with a changed address
> > limit if available.
>
> You are flipping the priorities of ->write and ->write_iter
> for kernel_write().

Note that by the end of the series (which has been in linux-next for a
while now) there is no order, as kernel_write only uses ->write_iter, so
a few patches later this becomes a moot point.
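
To illustrate the "no order" point, here is a minimal userspace model, not
the actual kernel code. All names (model_kvec, model_fops,
model_kernel_write, sink_write, sink_write_iter) are hypothetical
stand-ins, and returning -EINVAL when ->write_iter is missing is an
assumption of the sketch. It only shows that ->write is never consulted
once the kernel-side write path goes through ->write_iter alone:

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical stand-in for a kvec: one kernel-space buffer segment. */
struct model_kvec {
	const void *base;
	size_t len;
};

/* Hypothetical stand-in for the two write methods in file_operations. */
struct model_fops {
	ssize_t (*write)(const char *buf, size_t len);
	ssize_t (*write_iter)(const struct model_kvec *vec);
};

/*
 * Models the end state described above: the buffer is wrapped in a kvec
 * and handed to ->write_iter.  ->write is never looked at, so there is no
 * priority between the two methods.  The error code is an assumption.
 */
static ssize_t model_kernel_write(const struct model_fops *fops,
				  const void *buf, size_t len)
{
	struct model_kvec vec = { .base = buf, .len = len };

	if (!fops->write_iter)
		return -EINVAL;
	return fops->write_iter(&vec);
}

/* Sample ->write that would report an error if it were ever called. */
static ssize_t sink_write(const char *buf, size_t len)
{
	(void)buf;
	(void)len;
	return -EIO;
}

/* Sample ->write_iter that accepts the whole segment. */
static ssize_t sink_write_iter(const struct model_kvec *vec)
{
	return (ssize_t)vec->len;
}
```

With both methods set, the model returns the ->write_iter result; with
only ->write set, it fails instead of falling back.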

> Now, there are 4 instances of file_operations
> where we have both. null_fops and zero_fops are fine either way -
> ->write() and ->write_iter() do the same thing there (and arguably
> removing ->write might be the right thing; the only reason I hesitate
> is that writing to /dev/null *is* critical for many things, including
> the proper mail delivery ;-)
>
> However, the other two (infinibarf and pcm) are different; there we
> really have different semantics. I don't believe anything writes into
> either under KERNEL_DS, but having kernel_write() and vfs_write() with
> subtly different semantics is asking for trouble down the road.
>
> How about we remove ->write in null_fops/zero_fops and fail loudly if
> *both* ->write() and ->write_iter() are present (in kernel_write(),
> that is)?

I'm fine with removing plain ->write for /dev/null and /dev/zero, as
that seems the right thing to do.

Failing the kernel ops if both are present sounds fine, but I'm not sure
about the loud part, as it could be user-triggered through splice. I'd
go for the same kind of noticeable but not loud warning that we have for
the lack of iter ops in kernel_read/write.
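
Roughly what I have in mind, as a userspace sketch rather than a real
patch. The names (sketch_fops, sketch_warn_unsupported,
sketch_kernel_write, sketch_count) are hypothetical, and the print-once
counter is just one assumed way to get a "noticeable but not loud"
warning; it is not meant to mirror the existing kernel helper:

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical stand-in for the two write methods in file_operations. */
struct sketch_fops {
	ssize_t (*write)(const char *buf, size_t len);
	ssize_t (*write_iter)(const char *buf, size_t len);
};

/* Counts rejected calls; only the first one prints a message. */
static int sketch_warnings;

/* Quiet failure: one line on stderr the first time, then just -EINVAL. */
static ssize_t sketch_warn_unsupported(const char *op)
{
	if (sketch_warnings++ == 0)
		fprintf(stderr, "kernel %s not supported for this file\n", op);
	return -EINVAL;
}

/*
 * Fail if ->write_iter is missing, and also if both methods are wired up,
 * since the two could then have different semantics.
 */
static ssize_t sketch_kernel_write(const struct sketch_fops *fops,
				   const char *buf, size_t len)
{
	if (!fops->write_iter || fops->write)
		return sketch_warn_unsupported("write");
	return fops->write_iter(buf, len);
}

/* Sample method that accepts the whole buffer. */
static ssize_t sketch_count(const char *buf, size_t len)
{
	(void)buf;
	return (ssize_t)len;
}
```

A file with only ->write_iter writes normally; a file with both methods
gets -EINVAL, and repeated attempts (for example via splice) do not flood
the log.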

> There's a similar situation on the read side - there we have /dev/null
> with both ->read() and ->read_iter() (and there "remove ->read" is
> obviously the right thing to do) *and* we have pcm crap, with different
> semantics for ->read() and ->read_iter().