Re: [PATCH] binderfs: implement "max" mount option
From: Greg KH
Date: Sun Dec 23 2018 - 06:29:56 EST
On Sat, Dec 22, 2018 at 10:18:06PM +0100, Christian Brauner wrote:
> Since binderfs can be mounted by userns root in non-initial user namespaces
> some precautions are in order. First, a way to set a maximum on the number
> of binder devices that can be allocated per binderfs instance and second, a
> way to reserve a reasonable chunk of binderfs devices for the initial ipc
> namespace.
> A first approach as seen in [1] used sysctls similiar to devpts but was
> shown to be flawed (cf. [2] and [3]) since some aspects were unneeded. This
> is an alternative approach which avoids sysctls completely and instead
> switches to a single mount option.
>
> Starting with this commit binderfs instances can be mounted with a limit on
> the number of binder devices that can be allocated. The max=<count> mount
> option serves as a per-instance limit. If max=<count> is set then only
> <count> number of binder devices can be allocated in this binderfs
> instance.
Ok, this is fine, but why such a big default? You only need 4 to run a
modern android system, and anyone using binder outside of android is
really too crazy to ever be using it in a container :)
> Additionally, the binderfs instance in the initial ipc namespace will
> always have a reserve of at least 1024 binder devices unless explicitly
> capped via max=<count>.
Again, why so many? And why wouldn't that initial ipc namespace already
have their device nodes created _before_ anything else is mounted?
Some comments on the patch below:
> +/*
> + * Ensure that the initial ipc namespace always has a good chunk of devices
> + * available.
> + */
> +#define BINDERFS_MAX_MINOR_CAPPED (BINDERFS_MAX_MINOR - 1024)
Again that seems crazy big, how about splitting this into two different
patches, one for the max= stuff, and one for this "reserve some minors"
thing, so we can review them separately.
>
> static struct vfsmount *binderfs_mnt;
>
> @@ -46,6 +52,24 @@ static dev_t binderfs_dev;
> static DEFINE_MUTEX(binderfs_minors_mutex);
> static DEFINE_IDA(binderfs_minors);
>
> +/**
> + * binderfs_mount_opts - mount options for binderfs
> + * @max: maximum number of allocatable binderfs binder devices
> + */
> +struct binderfs_mount_opts {
> + int max;
> +};
> +
> +enum {
> + Opt_max,
> + Opt_err
> +};
> +
> +static const match_table_t tokens = {
> + { Opt_max, "max=%d" },
> + { Opt_err, NULL }
> +};
> +
> /**
> * binderfs_info - information about a binderfs mount
> * @ipc_ns: The ipc namespace the binderfs mount belongs to.
> @@ -55,13 +79,16 @@ static DEFINE_IDA(binderfs_minors);
> * created.
> * @root_gid: gid that needs to be used when a new binder device is
> * created.
> + * @mount_opts: The mount options in use.
> + * @device_count: The current number of allocated binder devices.
> */
> struct binderfs_info {
> struct ipc_namespace *ipc_ns;
> struct dentry *control_dentry;
> kuid_t root_uid;
> kgid_t root_gid;
> -
> + struct binderfs_mount_opts mount_opts;
> + atomic_t device_count;
Why atomic?
You should already have the lock held every time this is accessed,
so no need to use an atomic value, just use an int.
> /* Reserve new minor number for the new device. */
> mutex_lock(&binderfs_minors_mutex);
> - minor = ida_alloc_max(&binderfs_minors, BINDERFS_MAX_MINOR, GFP_KERNEL);
> + if (atomic_inc_return(&info->device_count) < info->mount_opts.max)
No need for atomic, see, your lock is held :)
thanks,
greg k-h