Re: [PATCH 1/2] vfs: pass ppos=NULL to .read()/.write() of FMODE_STREAM files

From: Linus Torvalds
Date: Sat Apr 13 2019 - 13:27:23 EST


On Sat, Apr 13, 2019 at 9:55 AM Kirill Smelkov <kirr@xxxxxxxxxx> wrote:
>
> --- a/fs/read_write.c
> +++ b/fs/read_write.c
> @@ -371,7 +371,7 @@ int rw_verify_area(int read_write, struct file *file, const loff_t *ppos, size_t
> inode = file_inode(file);
> if (unlikely((ssize_t) count < 0))
> return retval;
> - pos = *ppos;
> + pos = (ppos ? *ppos : 0);
> if (unlikely(pos < 0)) {
> if (!unsigned_offsets(file))
> return retval;

This part looks silly. We should just avoid all the position overflow
games when we don't have a position at all (ie streaming). You can't
overflow what you don't use.

Similarly, you can't use ranged mandatory locking on a stream, so the
mandatory locking thing seems dependent on pos too.

So I think that with a NULL ppos being possible, we should just change
the code to just do all of that conditionally on having a position,
rather than saying that the position of a stream is always 0.
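
As a rough illustration of "do it conditionally on having a position",
here is a userspace sketch, not the actual kernel code. The names
verify_area_sketch() and pos_t are made up, the error codes are
placeholders, and the unsigned_offsets() and locking details are left
out on purpose:

```c
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

typedef long long pos_t;	/* stand-in for the kernel's loff_t */

/*
 * Hypothetical, simplified model of the checks in rw_verify_area().
 * A NULL ppos means "stream, no position at all".
 */
static int verify_area_sketch(const pos_t *ppos, size_t count)
{
	/* Models the "(ssize_t) count < 0" test; needs no position. */
	if (count > SIZE_MAX / 2)
		return -EINVAL;

	if (ppos) {
		pos_t pos = *ppos;

		/* Position checks only make sense when there is a position. */
		if (pos < 0)
			return -EINVAL;
		if ((unsigned long long)count >
		    (unsigned long long)(LLONG_MAX - pos))
			return -EINVAL;	/* pos + count would overflow */

		/* A ranged mandatory-locking check would also go here. */
	}

	return 0;
}
```

With this shape a stream never enters the overflow branch at all,
instead of pretending to sit at offset 0.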

That said, this whole "let's make it possible to not have a position
at all" is a big change, and there's no way I'll apply these before
the 5.2 merge window.

And I'd really like to have people (Al?) look at this and go "yeah,
makes sense". I do think that moving to a model where we either have a
(properly locked) file position or no pos pointer at all is the right
model (ie I'd really like to get rid of the mixed case), but there
might be some practical problem that makes it impractical.

Because the *real* problem with the mixed case is not "insane people
who do bad things might get odd results". No, the real problem with
the mixed case is that it could be a security issue (ie: one process
intentionally changes the file position just as another process is
doing a 'read' and then avoids some limit test because the limit test
was done using the old 'pos' value but the actual IO was done using
the new one).
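
To make that check-versus-use window concrete, here is a small
single-threaded sketch. Everything in it is hypothetical (struct
racy_file, LIMIT_SKETCH, the 'racer' hook that stands in for the other
process); it only models the ordering problem and is not taken from any
real driver:

```c
#include <stddef.h>

typedef long long pos_t;	/* stand-in for loff_t */

struct racy_file {
	pos_t f_pos;		/* shared and unlocked in the mixed case */
};

enum { LIMIT_SKETCH = 100 };	/* made-up limit the driver wants to enforce */

/* Stands in for another process changing the position mid-read. */
static void lseek_past_limit(struct racy_file *f)
{
	f->f_pos = 1000;
}

/*
 * Returns the offset the IO would be done at, or -1 if rejected.
 * The limit test and the IO each read the shared f_pos separately.
 */
static pos_t io_offset_mixed(struct racy_file *f,
			     void (*racer)(struct racy_file *))
{
	if (f->f_pos >= LIMIT_SKETCH)	/* limit test sees the old value */
		return -1;

	if (racer)
		racer(f);		/* the race window */

	return f->f_pos;		/* IO uses the new value */
}
```

Calling io_offset_mixed() with lseek_past_limit as the racer on a file
whose f_pos starts below the limit returns 1000, so the limit test
passed for an offset it never actually looked at.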

So I suspect that we will have to either

- get rid of the mixed case entirely (and do only properly locked
f_pos accesses or pass in a NULL f_pos)

- continue to support the mixed case, but also continue to support
the nasty temporary 'pos' value with that file_pos_{read,write}()
thing.
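
For reference, the temporary 'pos' pattern in the second option can be
modelled roughly like this. It is a userspace sketch under assumptions:
the *_sketch names, struct pos_file and the toy driver are invented for
illustration and only mirror the general shape of the
file_pos_{read,write}() helpers:

```c
#include <stddef.h>

typedef long long pos_t;	/* stand-in for loff_t */

struct pos_file {
	pos_t f_pos;
};

static pos_t file_pos_read_sketch(const struct pos_file *f)
{
	return f->f_pos;
}

static void file_pos_write_sketch(struct pos_file *f, pos_t pos)
{
	f->f_pos = pos;
}

typedef long (*read_fn)(struct pos_file *f, size_t count, pos_t *ppos);

/* The driver only ever sees a private copy, never &f->f_pos itself. */
static long sys_read_sketch(struct pos_file *f, size_t count, read_fn do_read)
{
	pos_t pos = file_pos_read_sketch(f);	/* snapshot once */
	long ret = do_read(f, count, &pos);

	if (ret >= 0)
		file_pos_write_sketch(f, pos);	/* publish on success only */
	return ret;
}

/* Toy driver: refuses reads at or past offset 100, else advances. */
static long toy_driver_read(struct pos_file *f, size_t count, pos_t *ppos)
{
	(void)f;
	if (*ppos >= 100)
		return -1;
	*ppos += (pos_t)count;
	return (long)count;
}
```

Because the driver's limit test and its IO both go through the same
local 'pos', a concurrent change to f_pos cannot slip in between them.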

IOW, I would not be ok with passing in a shared - and unlocked -
&file->f_pos value to random drivers that *might* do odd things when a
race happens.

Linus