Re: [fuse-devel] fuse_get_context() and namespaces

From: Miklos Szeredi
Date: Tue May 26 2015 - 11:19:30 EST


On Fri, May 22, 2015 at 01:59:32PM -0500, Seth Forshee wrote:
> On Fri, May 22, 2015 at 12:44:35PM -0500, Eric W. Biederman wrote:
> > Seth Forshee <seth.forshee@xxxxxxxxxxxxx> writes:
> >
> > > On Fri, May 22, 2015 at 04:23:55PM +0200, Miklos Szeredi wrote:
> > >> On Sat, May 2, 2015 at 5:56 PM, <alexey@xxxxxxxxxxxxxxx> wrote:
> > >> >
> > >> > 3.10.0-229 form Scientific Linux and native 4.0.1-1 (from elrepo).
> > >> > SL 7.1 on the host and SL 6.6 on the LXC guest. At least in 3.10
> > >> > the 499dcf2024092e5cce41d05599a5b51d1f92031a is present.
> > >> > Steps to reproduce:
> > >> >
> > >> > On first console:
> > >> > [root@sl7test ~]# lxc-start -n test-2 /bin/su -
> > >> > [root@test-2 ~]# diff -u hello.py /usr/share/doc/fuse-python-0.2.1/example/hello.py
> > >> > --- hello.py 2015-05-02 11:12:13.963093580 -0400
> > >> > +++ /usr/share/doc/fuse-python-0.2.1/example/hello.py 2010-04-14 18:29:21.000000000 -0400
> > >> > @@ -41,8 +41,6 @@
> > >> > class HelloFS(Fuse):
> > >> >
> > >> > def getattr(self, path):
> > >> > - dic = Fuse.GetContext(self)
> > >> > - print dic
> > >> > st = MyStat()
> > >> > if path == '/':
> > >> > st.st_mode = stat.S_IFDIR | 0755
> > >> > [root@test-2 ~]# python hello.py -f /mnt/
> > >> >
> > >> > On second console:
> > >> > [root@test-2 ~]# echo $$
> > >> > 41
> > >> > [root@test-2 ~]# ls /mnt/
> > >> > hello
> > >> >
> > >> > Output of first console:
> > >> > {'gid': 0, 'pid': 12083, 'uid': 0}
> > >>
> > >> Thanks.
> > >>
> > >> Digging in mailbox... There was a thread last year about adding
> > >> support for running fuse daemon in a container:
> > >>
> > >> http://thread.gmane.org/gmane.linux.kernel/1811658
> > >>
> > >> Not sure what happened, but no updated patches have been posted or
> > >> maybe I just missed them.
> > >
> > > I haven't sent updated patches in a while. I still intend to, but I
> > > shifted focus to first getting general support for mounts from user
> > > namespaces into the vfs (which will give a clearer direction for some of
> > > the concerns raised about the fuse patches).
> > >
> > > All of this code is available in the userns-mounts branch of
> > > git://kernel.ubuntu.com/sforshee/linux.git, and I don't think the fuse
> > > patches actually depend on any of the stuff that precedes them. I'm
> > > planning to start submitting some of the earlier patches from that
> > > branch soon, and eventually get back to resubmitting the fuse patches.
> > >
> > > This is about pid namespaces though, and the fuse pid namespace patch
> > > from that series (see below) should be more or less independent of the
> > > rest of the patches. Potentially that could be merged separately from
> > > the user namespae stuff.
> >
> > [snip]
> >
> > > @@ -2076,7 +2077,15 @@ static int convert_fuse_file_lock(const struct fuse_file_lock *ffl,
> > >
> > > fl->fl_start = ffl->start;
> > > fl->fl_end = ffl->end;
> > > - fl->fl_pid = ffl->pid;
> > > +
> > > + /*
> > > + * Convert pid into the connection's pid namespace. If the
> > > + * pid does not map into the namespace fl_pid will get set
> > > + * to 0.
> > > + */
> > > + rcu_read_lock();
> > > + fl->fl_pid = pid_vnr(find_pid_ns(ffl->pid, fc->pid_ns));
> > > + rcu_read_unlock();
> >
> > Scratches my head. This looks wrong.
> >
> > I would have expected pid_nr_ns. Am I missing something reading this
> > patch quickly?
>
> Here we're in the context of a F_GETLK operation. We've requested the
> lock information from the fuse process, and ffl->pid is the pid number
> in that process's pid namespace so it needs to be translated into
> current's namespace. First we have to look up the struct pid, then
> pid_vnr is just a wrapper for pid_nr_ns in the current pid namespace:
>
> pid_t pid_vnr(struct pid *pid)
> {
> return pid_nr_ns(pid, task_active_pid_ns(current));
> }
>
> Oh, but the comment is wrong, so maybe that's what confused you.
> s/connection/caller/ there and it should make more sense.

Attaching updated patch against fuse.git for-next. Check namespace in both
device read and write. Check them at the start (doesn't matter if requests are
stuck in the queue, if server isn't playing by the rules, then all is lost
anyway).

One thing: we return error if current tgid isn't valid in server's namespace.
That's looks good. However we silently succeed and set in.h.pid to zero if
current pid isn't representable in the server's namespace. That doesn't sound
quite right.

Again the question is, does it make sense to allow access by tasks whose pid
isn't representable in the server. If not, then they should be rejected instead
of succeeding with an invalid PID, right?

Thanks,
Miklos
----

From: Seth Forshee <seth.forshee@xxxxxxxxxxxxx>
Date: Wed, 2 Jul 2014 16:29:19 -0500
Subject: fuse: Add support for pid namespaces

If the userspace process servicing fuse requests is running in
a pid namespace then pids passed via the fuse fd need to be
translated relative to that namespace. Capture the pid namespace
in use when the filesystem is mounted and use this for pid
translation.

As with user namespaces, since no use case currently exists for
changing namespaces so all translations are done relative to the
pid namespace in use when /dev/fuse is opened. Mounting or
/dev/fuse IO from another namespace will return errors. This
restriction can be relaxed at a later time if needed.

File locking changes based on previous work done by Eric
Biederman.

Signed-off-by: Seth Forshee <seth.forshee@xxxxxxxxxxxxx>
Signed-off-by: Miklos Szeredi <mszeredi@xxxxxxx>
---
fs/fuse/dev.c | 15 +++++++++++----
fs/fuse/file.c | 22 +++++++++++++++++-----
fs/fuse/fuse_i.h | 4 ++++
fs/fuse/inode.c | 3 +++
4 files changed, 35 insertions(+), 9 deletions(-)

--- a/fs/fuse/dev.c
+++ b/fs/fuse/dev.c
@@ -19,6 +19,7 @@
#include <linux/pipe_fs_i.h>
#include <linux/swap.h>
#include <linux/splice.h>
+#include <linux/sched.h>

MODULE_ALIAS_MISCDEV(FUSE_MINOR);
MODULE_ALIAS("devname:fuse");
@@ -124,11 +125,11 @@ static void __fuse_put_request(struct fu
atomic_dec(&req->count);
}

-static void fuse_req_init_context(struct fuse_req *req)
+static void fuse_req_init_context(struct fuse_conn *fc, struct fuse_req *req)
{
req->in.h.uid = from_kuid_munged(&init_user_ns, current_fsuid());
req->in.h.gid = from_kgid_munged(&init_user_ns, current_fsgid());
- req->in.h.pid = current->pid;
+ req->in.h.pid = pid_nr_ns(task_pid(current), fc->pid_ns);
}

void fuse_set_initialized(struct fuse_conn *fc)
@@ -181,7 +182,7 @@ static struct fuse_req *__fuse_get_req(s
goto out;
}

- fuse_req_init_context(req);
+ fuse_req_init_context(fc, req);
__set_bit(FR_WAITING, &req->flags);
if (for_background)
__set_bit(FR_BACKGROUND, &req->flags);
@@ -274,7 +275,7 @@ struct fuse_req *fuse_get_req_nofail_nop
if (!req)
req = get_reserved_req(fc, file);

- fuse_req_init_context(req);
+ fuse_req_init_context(fc, req);
__set_bit(FR_WAITING, &req->flags);
__clear_bit(FR_BACKGROUND, &req->flags);
return req;
@@ -1243,6 +1244,9 @@ static ssize_t fuse_dev_do_read(struct f
struct fuse_in *in;
unsigned reqsize;

+ if (task_active_pid_ns(current) != fc->pid_ns)
+ return -EIO;
+
restart:
spin_lock(&fiq->waitq.lock);
err = -EAGAIN;
@@ -1872,6 +1876,9 @@ static ssize_t fuse_dev_do_write(struct
struct fuse_req *req;
struct fuse_out_header oh;

+ if (task_active_pid_ns(current) != fc->pid_ns)
+ return -EIO;
+
if (nbytes < sizeof(struct fuse_out_header))
return -EINVAL;

--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -2061,7 +2061,8 @@ static int fuse_direct_mmap(struct file
return generic_file_mmap(file, vma);
}

-static int convert_fuse_file_lock(const struct fuse_file_lock *ffl,
+static int convert_fuse_file_lock(struct fuse_conn *fc,
+ const struct fuse_file_lock *ffl,
struct file_lock *fl)
{
switch (ffl->type) {
@@ -2076,7 +2077,14 @@ static int convert_fuse_file_lock(const

fl->fl_start = ffl->start;
fl->fl_end = ffl->end;
- fl->fl_pid = ffl->pid;
+
+ /*
+ * Convert pid into the caller's pid namespace. If the pid
+ * does not map into the namespace fl_pid will get set to 0.
+ */
+ rcu_read_lock();
+ fl->fl_pid = pid_vnr(find_pid_ns(ffl->pid, fc->pid_ns));
+ rcu_read_unlock();
break;

default:
@@ -2125,7 +2133,7 @@ static int fuse_getlk(struct file *file,
args.out.args[0].value = &outarg;
err = fuse_simple_request(fc, &args);
if (!err)
- err = convert_fuse_file_lock(&outarg.lk, fl);
+ err = convert_fuse_file_lock(fc, &outarg.lk, fl);

return err;
}
@@ -2137,7 +2145,8 @@ static int fuse_setlk(struct file *file,
FUSE_ARGS(args);
struct fuse_lk_in inarg;
int opcode = (fl->fl_flags & FL_SLEEP) ? FUSE_SETLKW : FUSE_SETLK;
- pid_t pid = fl->fl_type != F_UNLCK ? current->tgid : 0;
+ struct pid *pid = fl->fl_type != F_UNLCK ? task_tgid(current) : NULL;
+ pid_t pid_nr = pid_nr_ns(pid, fc->pid_ns);
int err;

if (fl->fl_lmops && fl->fl_lmops->lm_grant) {
@@ -2149,7 +2158,10 @@ static int fuse_setlk(struct file *file,
if (fl->fl_flags & FL_CLOSE)
return 0;

- fuse_lk_fill(&args, file, fl, opcode, pid, flock, &inarg);
+ if (pid && pid_nr == 0)
+ return -EOVERFLOW;
+
+ fuse_lk_fill(&args, file, fl, opcode, pid_nr, flock, &inarg);
err = fuse_simple_request(fc, &args);

/* locking is restartable */
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -22,6 +22,7 @@
#include <linux/rbtree.h>
#include <linux/poll.h>
#include <linux/workqueue.h>
+#include <linux/pid_namespace.h>

/** Max number of pages that can be used in a single read request */
#define FUSE_MAX_PAGES_PER_REQ 32
@@ -456,6 +457,9 @@ struct fuse_conn {
/** The group id for this mount */
kgid_t group_id;

+ /** The pid namespace for this mount */
+ struct pid_namespace *pid_ns;
+
/** The fuse mount flags for this mount */
unsigned flags;

--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -20,6 +20,7 @@
#include <linux/random.h>
#include <linux/sched.h>
#include <linux/exportfs.h>
+#include <linux/pid_namespace.h>

MODULE_AUTHOR("Miklos Szeredi <miklos@xxxxxxxxxx>");
MODULE_DESCRIPTION("Filesystem in Userspace");
@@ -609,6 +610,7 @@ void fuse_conn_init(struct fuse_conn *fc
fc->connected = 1;
fc->attr_version = 1;
get_random_bytes(&fc->scramble_key, sizeof(fc->scramble_key));
+ fc->pid_ns = get_pid_ns(task_active_pid_ns(current));
}
EXPORT_SYMBOL_GPL(fuse_conn_init);

@@ -617,6 +619,7 @@ void fuse_conn_put(struct fuse_conn *fc)
if (atomic_dec_and_test(&fc->count)) {
if (fc->destroy_req)
fuse_request_free(fc->destroy_req);
+ put_pid_ns(fc->pid_ns);
fc->release(fc);
}
}
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/