Re: [RFC] relinquish_fs() syscall

From: Mitchell Blank Jr
Date: Tue Nov 30 2004 - 08:30:49 EST


Alan Cox wrote:
> With CAP_SYS_RAWIO I can ask the IDE controller to DMA into the kernel
> as one example.

Can you really do that on normal file descriptors? Weird. I'd have thought
you'd need to open /dev/hd* to do that.

Oh, and you also need to revoke CAP_SYS_PTRACE to prevent a compromised
process from taking over a process inside the jail

> Without it should be sane. However if you take away all
> the capabilities then you don't need the other changes because mount
> works just fine.

True; for the root-user case you can build an equivalent jail as long as you
* remove CAP_SYS_RAWIO
* remove CAP_SYS_ADMIN to prevent mount/umount
* make the root somehow immutable (make mountpoint r/o; ext2 attribute;
whatever)

I probably shouldn't have even mentioned the jail-root case in my description;
my only point is that relinquish_fs() is at least as good as the currently
used chroot("/var/empty") in every way. It's also stronger in some
ways. Obviously a rogue process running with a full CAP_SYS_* set can
do all kinds of bad things even inside a jail (up to and including a
reboot)

The real point of the patch is to add a method giving this same power to
unprivileged processes, functionality which is currently lacking but
useful. The existing namespace support makes it almost trivial to provide.

> Yes I realise that but you also need to realise that glibc has a lot of
> supporting baggage it likes to find if its going to function sanely.

And how is this different then people using chroot() currently? The people
doing privsep designs understand the limits libc and can work within them.

> > > Imagine a jpeg decoder using an SELinux policy.
> >
> > SELinux is great but it's a pretty orthoginal issue. There's no reason the
> > two can't coexist.
>
> I don't see it as orthogonal except for the "done by user" aspect.

Yes and that's the point. relinquish_fs() is a tool for defensive
programming. SELinux is a tool for sysadmins and distribution vendors
to enforce policy. Ideally they would both be used -- defense in depth
is a good thing.

> BTW: you also have to deal with fchdir() and potentially AF_UNIX
> sockets.

Is AF_UNIX in a separate namespace? My understanding (from reading
unix_find_other()) is that unless you can create a UNIX socket in your
filesystem you're going to have trouble creating new UNIX sockets.
I think some *NIXes worked differently but linux seems sane in this
regard. So you'd have to trick your privsep-peer outside the jail to
send you a directory fd over a unix-domain socket you already had open
before you jailed yourself.

-Mitch
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/