Re: FUSE proxying for ABI filesystems?

From: Topi Miettinen
Date: Sun Mar 29 2015 - 16:17:51 EST


On 03/21/15 18:07, Topi Miettinen wrote:
> Hello all,
>
> I've made a small control program that intercepts and filters filesystem
> operations of processes launched by it with FUSE. With it, FS operations
> can be filtered by access type (e.g. getattr/read, cf. AppArmor or
> TOMOYO Linux) or, for more fine-grained control, by which area of the
> file is being accessed. This lets me differentiate between, for example,
> 'bash -c exit' and 'bash -c "echo foo;exit"', which is far beyond what
> any current MAC can do. It works even with complex programs like
> iceweasel or chromium with only some slowdown on startup.
>
> But due to limitations of FUSE, ABI file systems etc. (/proc, /sys,
> certain devices) can't be intercepted very well. For example, it's
> pretty easy (maybe racy) to change readlink("/proc/self") to
> readlink("/proc/$PID_OF_CLIENT"). But handling the client opening TTY
> devices _without_ O_NOCTTY does not look so simple and there seem to be
> a number of other interesting cases. For more fun, the control program
> and its client can be in different namespaces and maybe even the client
> should be able to perform arbitrary mounting and namespace operations,
> even use FUSE recursively.
>
> I think the way to manage this mess is to let the control program
> temporarily switch its way of viewing and using ABI file systems, in a
> way that setfsuid()/setfsgid() does not allow, so that the above cases
> can be handled reliably.
>
> For example, new system calls could be added, like setfspid(pid_t
> client_pid) for /proc/self and TTY handling, and maybe something like
> setfsns() for namespace control.

First, I think adding setfspid() would be doable. It should not affect
FUSE itself: there would only be a new system call that the control
program could use while servicing access requests. I'll try to make a
patch to provoke more discussion.
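
For comparison, the user-space workaround that setfspid() would make
unnecessary (rewriting /proc/self to /proc/$PID_OF_CLIENT) could look
roughly like the sketch below. The function name rewrite_proc_self() is
made up for illustration, and the client PID is passed in as a plain
parameter; in a libfuse handler it could be taken from
fuse_get_context()->pid. The sketch does nothing about the PID reuse
race or about the control program and client being in different PID
namespaces.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/*
 * Hypothetical helper: copy path to out, replacing a leading
 * "/proc/self" component with "/proc/<client_pid>".
 * Returns 0 on success, -1 if the result does not fit in out.
 */
static int rewrite_proc_self(const char *path, pid_t client_pid,
                             char *out, size_t outlen)
{
    static const char prefix[] = "/proc/self";
    const size_t plen = sizeof(prefix) - 1;
    int n;

    /* Match "/proc/self" only as a whole component, not "/proc/selfish". */
    if (strncmp(path, prefix, plen) == 0 &&
        (path[plen] == '\0' || path[plen] == '/'))
        n = snprintf(out, outlen, "/proc/%ld%s",
                     (long)client_pid, path + plen);
    else
        n = snprintf(out, outlen, "%s", path);

    /* snprintf() reports truncation by returning >= outlen. */
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

With client PID 1234, "/proc/self/fd" would become "/proc/1234/fd",
while "/proc/selfish" and paths outside /proc are passed through
unchanged.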

The other issue, managing various namespace and filesystem/VFS
operations, looks pretty complex. To be comprehensive, the list of
managed operations should include at least (re)mounting and unmounting,
chroot(2), pivot_root(2), unshare(2), and setns(2). The FUSE protocol
could be extended to proxy these operations, or a new interface could
be added (VFS in User Space). On the kernel side, either approach would
touch a lot of places. By reusing the FUSE protocol, the control program
could handle normal FUSE accesses and the extended VFS accesses through
a single interface.
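
To make the scope a bit more concrete, here is a minimal sketch of how
the control program's side of such an extension might enumerate the
proxied operations and apply a per-client policy. Everything in it is
hypothetical: none of the VFS_PROXY_* request types exist in the FUSE
protocol, and struct vfs_proxy_policy and vfs_proxy_allowed() are names
invented for this example.

```c
#include <stdbool.h>

/* Hypothetical request types; none of these exist in the FUSE protocol. */
enum vfs_proxy_op {
    VFS_PROXY_MOUNT,
    VFS_PROXY_REMOUNT,
    VFS_PROXY_UMOUNT,
    VFS_PROXY_CHROOT,
    VFS_PROXY_PIVOT_ROOT,
    VFS_PROXY_UNSHARE,
    VFS_PROXY_SETNS,
};

/* Hypothetical per-client policy: one bit per operation above. */
struct vfs_proxy_policy {
    unsigned int allowed_mask;
};

/* Default deny: an operation is allowed only if its bit is set. */
static bool vfs_proxy_allowed(const struct vfs_proxy_policy *policy,
                              enum vfs_proxy_op op)
{
    return (policy->allowed_mask & (1u << op)) != 0;
}
```

A client whose policy has only the VFS_PROXY_CHROOT bit set could then
chroot but would be refused every mount and namespace operation.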

But while pretty interesting, I'm not sure this would be worth the
effort, because I think attempting to contain a heavily privileged
process with another user space process is probably doomed to fail.
Kernel-level MAC systems (while not as fine-grained and flexible as I
want) have a better chance there.

However, controlling unprivileged programs like browsers is easier, so
a simple FUSE approach (i.e. one not implementing mount etc.) should be
good enough for them, and it can still bring added value over the
kernel's MAC systems.

-Topi
