Re: [RFC PATCH] fs: allow open(dir, O_TMPFILE|..., 0) with mode 0
From: Andy Lutomirski
Date: Thu Oct 30 2014 - 17:49:01 EST
On 10/30/2014 02:08 AM, Eric Rannaud wrote:
> The man page for open(2) indicates that when O_CREAT is specified, the
> 'mode' argument applies only to future accesses to the file:
>
> Note that this mode applies only to future accesses of the newly
> created file; the open() call that creates a read-only file
> may well return a read/write file descriptor.
>
> The man page for open(2) implies that 'mode' is treated identically by
> O_CREAT and O_TMPFILE.
>
> O_TMPFILE, however, behaves differently:
>
> int fd = open("/tmp", O_TMPFILE | O_RDWR, 0);
> assert(fd == -1);
> assert(errno == EACCES);
>
> fd = open("/tmp", O_TMPFILE | O_RDWR, 0600);
> assert(fd >= 0);
>
> For O_CREAT, do_last() sets acc_mode to MAY_OPEN only:
>
> 	if (*opened & FILE_CREATED) {
> 		/* Don't check for write permission, don't truncate */
> 		open_flag &= ~O_TRUNC;
> 		will_truncate = false;
> 		acc_mode = MAY_OPEN;
> 		path_to_nameidata(path, nd);
> 		goto finish_open_created;
> 	}
>
> But for O_TMPFILE, do_tmpfile() passes the full op->acc_mode to
> may_open().
>
> This patch lines up the behavior of O_TMPFILE with O_CREAT. After the
> inode is created, may_open() is called with acc_mode = MAY_OPEN, in
> do_tmpfile().
>
> A different but related glibc bug revealed the discrepancy:
> https://sourceware.org/bugzilla/show_bug.cgi?id=17523
>
> glibc lazily loads the 'mode' argument of open() and openat() using
> va_arg() only if O_CREAT is present in 'flags' (to support both the
> 2-argument and the 3-argument forms of open(); same idea for openat()).
> However, glibc does not load the 'mode' argument when only O_TMPFILE is
> set in 'flags', so the value given by the caller is ignored.
>
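The lazy loading described above can be sketched as follows. This is only an illustration, not glibc's actual code: needs_mode() and open_wrapper() are hypothetical names, and the fallback value for O_TMPFILE is the x86 one, assumed here in case the headers do not define it. The point is that O_TMPFILE must be tested alongside O_CREAT before 'mode' is fetched.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdarg.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Fallback for older headers; this is the x86 value (an assumption). */
#ifndef O_TMPFILE
#define O_TMPFILE (020000000 | O_DIRECTORY)
#endif

/* Hypothetical helper: do these flags make the kernel look at 'mode'? */
int needs_mode(int flags)
{
	/* O_TMPFILE includes the O_DIRECTORY bit, so compare all of its bits. */
	return (flags & O_CREAT) != 0 || (flags & O_TMPFILE) == O_TMPFILE;
}

/* Hypothetical wrapper: fetch the variadic 'mode' only when it is used. */
int open_wrapper(const char *path, int flags, ...)
{
	unsigned int mode = 0;

	assert(path != NULL);
	if (needs_mode(flags)) {
		va_list ap;

		va_start(ap, flags);
		/* mode_t is passed as a promoted integer in varargs. */
		mode = va_arg(ap, unsigned int);
		va_end(ap);
	}
	return open(path, flags, (mode_t)mode);
}
```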
> On x86_64, for open(), it happens to work anyway: 'mode' is in RDX
> when entering open() and is still in RDX on SYSCALL, which is where
> the kernel looks for the 3rd argument of a syscall.
>
> But openat() is not quite so lucky: 'mode' is in RCX when entering the
> glibc wrapper for openat(), while the kernel looks for the 4th argument
> of a syscall in R10. Indeed, the syscall calling convention differs from
> the regular calling convention in this respect on x86_64. So the kernel
> sees mode = 0 when trying to use glibc openat() with O_TMPFILE, and
> fails with EACCES.
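One way to sidestep the wrapper while testing is to go through syscall(2), which passes every argument on to the kernel, including the 4th one. A minimal sketch, assuming Linux and a libc that provides SYS_openat; raw_openat() is a hypothetical helper name, not an existing API:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: invoke openat through syscall(2) so that 'mode'
 * always reaches the kernel, whatever the openat() wrapper does with it.
 */
int raw_openat(int dirfd, const char *path, int flags, mode_t mode)
{
	assert(path != NULL);
	return (int)syscall(SYS_openat, dirfd, path, flags, mode);
}
```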

Looks sensible. Should this be Cc: stable?

--Andy

>
> Signed-off-by: Eric Rannaud <e@xxxxxxxxxxxxxxxx>
> ---
> fs/namei.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/fs/namei.c b/fs/namei.c
> index 42df664e95e5..78512898d3ba 100644
> --- a/fs/namei.c
> +++ b/fs/namei.c
> @@ -3154,7 +3154,8 @@ static int do_tmpfile(int dfd, struct filename *pathname,
>  	if (error)
>  		goto out2;
>  	audit_inode(pathname, nd->path.dentry, 0);
> -	error = may_open(&nd->path, op->acc_mode, op->open_flag);
> +	/* Don't check for other permissions, the inode was just created */
> +	error = may_open(&nd->path, MAY_OPEN, op->open_flag);
>  	if (error)
>  		goto out2;
>  	file->f_path.mnt = nd->path.mnt;
>