Re: [PATCH v3 10/11] rlimits: implement prlimit64 syscall

From: Andrew Morton
Date: Thu May 13 2010 - 18:57:33 EST


On Mon, 10 May 2010 20:00:50 +0200
Jiri Slaby <jslaby@xxxxxxx> wrote:

> This patch adds the code to support the sys_prlimit64 syscall which
> modifies-and-returns the rlim values of a selected process
> atomically. The first parameter, pid, being 0 means current process.
>
> Unlike the current implementation, it is a generic interface,
> architecture indepentent so that we needn't handle compat stuff
> anymore. In the future, after glibc start to use this we can deprecate
> sys_setrlimit and sys_getrlimit in favor to clean up the code finally.
>
> It also adds a possibility of changing limits of other processes. We
> check the user's permissions to do that and if it succeeds, the new
> limits are propagated online. This is good for large scale
> applications such as SAP or databases where administrators need to
> change limits time by time (e.g. on crashes increase core size). And
> it is unacceptable to restart the service.
>
> For safety, all rlim users now either use accessors or doesn't need
> them due to
> - locking
> - the fact a process was just forked and nobody else knows about it
> yet (and nobody can't thus read/write limits)
> hence it is safe to modify limits now.
>
> The limitation is that we currently stay at ulong internal
> representation. So we use the rlim64_is_infinity check where we
> compare to ULONG_MAX on 32-bit which is the maximum value there.
>
> And since internally we hold limits in struct rlimit, we introduce
> converters used before and after do_prlimit call in sys_prlimit64.
>

Is this worth all the new code and the increase in locking dependencies
which I think is there?

This could all be done in userspace, couldn't it? Write a little library
which clones a thread then waits for someone to send it a
change-your-rlimits message. Write a little tool to send those
messages and voila.

>
> ...
>
> +SYSCALL_DEFINE4(prlimit64, pid_t, pid, unsigned int, resource,
> + const struct rlimit64 __user *, new_rlim,
> + struct rlimit64 __user *, old_rlim)
> +{
> + struct rlimit64 old64, new64;
> + struct rlimit old, new;
> + struct task_struct *tsk;
> + int ret;
> +
> + if (new_rlim) {
> + if (copy_from_user(&new64, new_rlim, sizeof(new64)))
> + return -EFAULT;
> + rlim64_to_rlim(&new64, &new);
> + }
> +
> + /* we don't want to fail after do_rlimit */

You meant "do_prlimit".

If doesn't explain _why_ we don't want to fail. And it shold do so,
because the __copy_to_user() can still fail.


> + if (old_rlim && !access_ok(VERIFY_WRITE, old_rlim, sizeof(old64)))
> + return -EFAULT;
> +
> + rcu_read_lock();
> + tsk = pid ? find_task_by_vpid(pid) : current;
> + if (!tsk) {
> + rcu_read_unlock();
> + return -ESRCH;
> + }
> + ret = check_prlimit_permission(tsk);
> + if (ret) {
> + rcu_read_unlock();
> + return ret;
> + }
> + get_task_struct(tsk);
> + rcu_read_unlock();
> +
> + ret = do_prlimit(tsk, resource, new_rlim ? &new : NULL,
> + old_rlim ? &old : NULL);
> +
> + if (!ret && old_rlim) {
> + rlim_to_rlim64(&old, &old64);
> + if (WARN_ON_ONCE(__copy_to_user(old_rlim, &old64,
> + sizeof(old64))))
> + ret = -EFAULT;
> + }
> +
> + put_task_struct(tsk);
> + return ret;
> +}

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/