[PATCH 00/13] devpts: New instances for every mount
From: Eric W. Biederman
Date: Mon Apr 04 2016 - 20:14:13 EST
To recap the situation for those who have not been following closely.
There are programs such as xen-create-image that run as root and setup
a chroot environment with:
"mknod dev/ptmx c 5 2"
"mkdir dev/pts"
"mount -t devpts none dev/pts"
Which mostly works but stomps the mount options of the system /dev/pts.
In particular the options of "gid=5,mode=620" are lost resulting in a
situation where creating a new pty by opening /dev/ptmx results in
that pty having the wrong permissions.
Some distributions have been working around this problem by continuing
to install a setuid root pt_chown binary that will be called by glibc
to fix the permissions.
This solution isn't too scary as long as there is only one instance
of devpts but add a second instance of devpts and it becomes possible
to trick the setuid root pt_chown binary into operating on the wrong
files and directories.
The following patchset attempts to dig use out of this mess by carefully
chaning devpts in a way that does not induce any userspace regressions.
My expectation is that userspace developers will love us as it makes
their problems go away, and kernel developers will not be happy with the
changes because what is required to preserve backwards compatibility is
not what anyone would have designed with a clean sheet implementation.
To dig our selves out of this mess it has been generally agreed that it
makes sense for each mount of devpts to result in a separate instance of
devpts. That works for ensuring that a new mount of devpts does not
stomp the permissions of another mount of devpts but then programs such
as xen-create-image break as they expect opening of /dev/ptmx to create
new ptys.
The problem of /dev/ptmx needing to create ptys on different instances
of devpts can be resolved by performing a path based look up from the
"/dev/ptmx" node to find the devpts filesystem.
How we get each mount of devpts to result in a distinct instance of
devpts matters. The kernel treats the system instance of devpts
specially and to maintain backwards compatibility that needs to be
preserved.
As there must be a system instance of devpts the code continues in it's
technique of mounting the system instance of devpts internally and then
exporting it userspace when devpts is mounted under the right
circumstances.
There is one pattern in userspace where an intial ramdisk mounts devpts
then unmounts devpts and the usual system startup scripts then mount
devpts on /dev/pts. I believe Centos5 and Openwrt use this pattern.
There is another pattern of userspace where devpts is mounted in the
initial ramdisk and then "mount --move" is used to move it onto the
final /dev/pts. However devpts remains listed in /etc/fstab and later
in the boot a "mount -a" honors that listing starts mounting devpts on
top of itself. Fails but only after succeeding in changing the mount
options. I believe this is the pattenr that Centos6 uses.
Then there is the question of which permissions checks should apply when
/dev/ptmx is opened. The way it works in v4.6-rc1 is that a pty on the
system instance of devtps can be created either by opening /dev/ptmx or
/dev/pts/ptmx (which happens to reside on the system instance. The only
restriction are the normal unix permission of opening those files. A
pty on a non-system instance of devpts must be created by opening the
ptmx file on the non-systme devpts instance. The default permisions for
/dev/ptmx are 0666 and for /dev/pts/ptmx 0000. Today for non-system
instances of devpts the permission on the ptmx file are always changed to
something else (typically 0666), and typically on the system instance
the permission of ptmx on devpts are ignoreed and left at 0000.
Which leaves the question of what to do about cases such as
xen-create-image where the new /dev/ptmx path based lookup is exercised
to find the mount of devpts. In that case out of an abundance of
caution I require the code to verify that the opener also has permission
to open the ptmx file on the non-system instance of devpts. To make
that work I have changed the default permissions on the ptmx file for
all non-system instance of devpts to 0666. Where the permission check
comes in as useful is on any system where a non-system instance of
devpts and has permissions on it's ptmx file that are not 0666. If that
was not done a simple bind mount of the /dev/ptmx device node would
allow overriding the policy of who is allowed to create ptys in an
instance of devpts.
As you can see above my guiding principle this round has been be very
careful to keep existing userspace working. I have tested this code on
as wide a range of distributions as I could to look for intesting
behavior. Those I managed to get setup and running in a vm for testing
are: on openwrt-15.05, centos5, centos6, centos7, debian-6.0.2,
debian-7.9, debian-8.2, ubuntu-14.04.3, ubuntu-15.10, fedora23, magia-5,
mint-17.3, opensuse-42.1, slackware-14.1, gentoo-20151225 (13.0?),
archlinux-2015-12-01.
As I could not find an image of Android that was easy to get running
in a VM, so I audited the code to see what Android does. Unlike
reports earlier in this conversation Android does not use a shell
script. Android has a daemon that listens on netlink for device events,
consults it's policy data base and creates the device node in a tmpfs
instance mounted on /dev according to policy (assuming the policy allows
that device node). Furthermore at the start of that daemon devpts is
mounted exactly once. Which thankfully means Android poses no special
problems for this patchset.
I have also run xen-create-image on debian 8.2 (where it was easily
installed with apt-get) and confirmed that without these changes it
stomps the mount options of devpts and with these changes it only uses
atypical mount options on a separate instance of devpts.
The first change in this series adds magic to /dev/ptmx. The rest of
the changes deal with all of the little issues needed to ensure that
every mount of devpts is a distinct instance.
Eric W. Biederman (13):
devpts: Teach /dev/ptmx to find the associated devpts via path lookup
devpts: More obvious check for the system devpts in pty allocation
devpts: Cleanup newinstance parsing
devpts: Stop rolling devpts_remount by hand in devpts_mount
devpts: Fail early (if appropriate) on overmount
devpts: Use the same default mode for both /dev/ptmx and dev/pts/ptmx
devpts: Move parse_mount_options into fill_super
devpts: Make devpts_kill_sb safe if fsi is NULL
devpts: Move the creation of /dev/pts/ptmx into fill_super
devpts: Simplify devpts_mount by using mount_nodev
vfs: Implement mount_super_once
devpts: Always return a distinct instance when mounting
devpts: Kill the DEVPTS_MULTIPLE_INSTANCE config option
Documentation/filesystems/devpts.txt | 122 +++++++----------
drivers/tty/Kconfig | 11 --
drivers/tty/pty.c | 4 +
drivers/tty/tty_io.c | 5 +-
fs/devpts/inode.c | 250 ++++++++++++++++++++---------------
fs/namei.c | 64 +++++++--
fs/namespace.c | 8 ++
fs/open.c | 19 +++
fs/super.c | 12 ++
include/linux/devpts_fs.h | 5 +
include/linux/fs.h | 5 +
include/linux/namei.h | 3 +
12 files changed, 302 insertions(+), 206 deletions(-)
This code is also available at:
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace.git devpts-for-testing
Eric