Re: [PATCH v4 5/6] add listmount(2) syscall
From: Linus Torvalds
Date: Thu Jan 11 2024 - 15:22:32 EST
On Thu, 11 Jan 2024 at 10:57, Guenter Roeck <linux@xxxxxxxxxxxx> wrote:
>
> Any variance of put_user() with &buf[ctr] or buf + ctr fails
> if ctr is a variable and permitted to be != 0.
Crazy. But the 64-bit put_user() is a bit special and tends to require
more registers (the 64-bit value is passed in two registers), so that
probably then results in the ICE.
Side note: looking at the SH version of __put_user_u64(), I think it's
buggy and is missing the exception handler for the second 32-bit move.
I dunno, I don't read sh asm, but it looks suspicious.
> The following works. Would this be acceptable ?
It might be very easy to trigger this once again if somebody goes "that's silly"
That said, I also absolutely detest the "error handling" in that
function. It's horrible.
Noticing the user access error in the middle is just sad, and if that
was just handled better and at least the range was checked first, the
overflow error couldn't happen and checking for it is thus pointless.
And looking at it all, it really looks like the whole interface is
broken. The "bufsize" argument isn't the size of the buffer at all.
It's the number of entries.
Extra confusingly, in the *other* system call, bufsize is in fact the
size of the buffer.
And the 'ctr' overflow checking is doubly garbage, because the only
reason *that* can happen is that we didn't check the incoming
arguments properly.
Same goes for the whole array_index_nospec() - it's pointless, because
the user controls what that code checks against anyway, so there's no
point to trying to manage some range checking.
The only range checking there that matters would be the one that
put_user() has to do against the address space size, but that's done
by put_user().
End result: that thing needs a rewrite.
The SH put_user64() needs to be looked at too, but in the meantime,
maybe something like this fixes the problems with listmount?
NOTE! ENTIRELY untested, but that naming and lack of argument sanity
checking really is horrendous. We should have caught this earlier.
Linus
fs/namespace.c | 31 +++++++++++++++++--------------
1 file changed, 17 insertions(+), 14 deletions(-)
diff --git a/fs/namespace.c b/fs/namespace.c
index ef1fd6829814..df74f4769733 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -5043,12 +5043,17 @@ static struct mount *listmnt_next(struct mount *curr)
}
static ssize_t do_listmount(struct mount *first, struct path *orig, u64 mnt_id,
- u64 __user *buf, size_t bufsize,
+ u64 __user *buf, size_t nentries,
const struct path *root)
{
struct mount *r;
- ssize_t ctr;
- int err;
+ const size_t maxentries = (size_t)-1 >> 3;
+ ssize_t ret;
+
+ if (unlikely(nentries > maxentries))
+ return -EFAULT;
+ if (!access_ok(buf, nentries * sizeof(*buf)))
+ return -EFAULT;
/*
* Don't trigger audit denials. We just want to determine what
@@ -5058,26 +5063,24 @@ static ssize_t do_listmount(struct mount *first, struct path *orig, u64 mnt_id,
!ns_capable_noaudit(&init_user_ns, CAP_SYS_ADMIN))
return -EPERM;
- err = security_sb_statfs(orig->dentry);
- if (err)
- return err;
+ ret = security_sb_statfs(orig->dentry);
+ if (ret)
+ return ret;
- for (ctr = 0, r = first; r && ctr < bufsize; r = listmnt_next(r)) {
+ for (ret = 0, r = first; r && nentries; r = listmnt_next(r)) {
if (r->mnt_id_unique == mnt_id)
continue;
if (!is_path_reachable(r, r->mnt.mnt_root, orig))
continue;
- ctr = array_index_nospec(ctr, bufsize);
- if (put_user(r->mnt_id_unique, buf + ctr))
+ if (put_user(r->mnt_id_unique, buf))
return -EFAULT;
- if (check_add_overflow(ctr, 1, &ctr))
- return -ERANGE;
+ buf++, ret++; nentries--;
}
- return ctr;
+ return ret;
}
SYSCALL_DEFINE4(listmount, const struct mnt_id_req __user *, req,
- u64 __user *, buf, size_t, bufsize, unsigned int, flags)
+ u64 __user *, buf, size_t, nentries, unsigned int, flags)
{
struct mnt_namespace *ns = current->nsproxy->mnt_ns;
struct mnt_id_req kreq;
@@ -5111,7 +5114,7 @@ SYSCALL_DEFINE4(listmount, const struct mnt_id_req __user *, req,
else
first = mnt_find_id_at(ns, last_mnt_id + 1);
- ret = do_listmount(first, &orig, mnt_id, buf, bufsize, &root);
+ ret = do_listmount(first, &orig, mnt_id, buf, nentries, &root);
err:
path_put(&root);
up_read(&namespace_sem);