Re: [GIT] writable_limits for 2.6.36

From: Chris Metcalf
Date: Tue Aug 10 2010 - 12:21:51 EST


On 8/10/2010 12:01 PM, Linus Torvalds wrote:
> 2010/8/7 Jiri Slaby <jslaby@xxxxxxx>:
>
>> please consider the following repository for 2.6.36. It introduces a new
>> syscall for arch independent resource limits handling. It also adds a
>> support for runtime limits changing. This feature is needed mostly by
>> daemons servicing databases and similar service where limits are needed
>> to be changed without services being restarted on production systems.
>>
> Ok, so the code looks fine, and I don't have any real objections any
> more. I don't know how much use this will get, but it doesn't appear
> to be "wrong" in any way. So I was going to pull it.
>
> However, in the meantime we have commit 5360bd776f73 ("Fix up the
> "generic" unistd.h ABI to be more useful") that clashes with it. Now,
> the conflict is trivial to resolve, and I could do that easily - it's
> not a technical problem. But that commit code comments say
>
> + * Architectures may provide up to 16 syscalls of their own
> + * starting with this value.
> + */
> +#define __NR_arch_specific_syscall 244
>
> and the new writable rlimits syscall is obviously 244.
>

Jiri and I actually discussed this back on July 20th on LKML when it
first conflicted in linux-next, and at the time he said he'd move
prlimit64 to 261 in <asm-generic/unistd.h>. It looks like what actually
stuck in linux-next was different, however. It's partly my fault for
not following up on this.

> Now, looking at it all, I think that commit was badly done - not
> leaving any room for new generic system calls is pretty iffy. And if I
> had happened to take the Tilera merge later, I'd have had no problems
> with just changing it. As is, though, I want to check with Arnd and
> Chris first.
>

In any case, obviously the larger question is how many
architecture-specific syscalls are appropriate, and where they should be
located in the syscall number space. To be clear, the model for new
generic system calls is that they just continue on after the 16
architecture-specific ones, and in fact __NR_wait4 is already an example
of just this -- it was placed there to avoid making trouble for the "score"
architecture, since wait4 was deprecated and then later un-deprecated. So
new generic syscalls are not a problem.
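
To make the numbering model concrete, here is a minimal sketch of the
layout described above. The SKETCH_* names are hypothetical and exist only
for illustration (the real header uses __NR_* macros); the value 244 and the
window of 16 come from the commit comment quoted earlier, and 261 is the
slot mentioned for prlimit64. Placing wait4 in the first slot after the
window is an assumption made for the example.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdbool.h>

/* Hypothetical names for illustration; not the kernel's own macros. */
#define SKETCH_NR_ARCH_SPECIFIC   244  /* start of the arch-specific window */
#define SKETCH_ARCH_SPECIFIC_MAX  16   /* "up to 16 syscalls of their own" */

/* First number past the arch-specific window: 244 + 16 = 260. */
#define SKETCH_NR_FIRST_AFTER_ARCH \
	(SKETCH_NR_ARCH_SPECIFIC + SKETCH_ARCH_SPECIFIC_MAX)

/* Assumption: wait4 takes the first slot after the window. */
#define SKETCH_NR_WAIT4      SKETCH_NR_FIRST_AFTER_ARCH
/* The slot discussed for prlimit64 in this thread. */
#define SKETCH_NR_PRLIMIT64  261

/* A new generic syscall must not land inside the arch-specific window. */
static_assert(SKETCH_NR_PRLIMIT64 >= SKETCH_NR_FIRST_AFTER_ARCH,
	      "generic syscall collides with the arch-specific window");

/* True if nr falls inside the window reserved for architectures. */
static inline bool sketch_is_arch_specific(int nr)
{
	return nr >= SKETCH_NR_ARCH_SPECIFIC &&
	       nr < SKETCH_NR_FIRST_AFTER_ARCH;
}
```

With this layout, a syscall numbered 244 trips the check, while 261 sits
just past the window and compiles cleanly.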

There is definitely some tension between allowing architectures free
rein with their own set of unlimited additional syscalls on the one
hand, and having a contiguous and small array of syscalls on the other
hand. I suspect it's slightly nicer to have a contiguous and small
array, as long as we've provided enough room for architectures to add
extra syscalls, but I'm not strongly married to this position.
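
As a rough way to put numbers on that tradeoff, the sketch below compares
the minimum table size needed to cover the arch-specific window for two
possible starting points, 244 and 512. It assumes a flat table with one
function pointer per syscall number; the sketch_* names and the
sketch_syscall_fn type are hypothetical and not taken from the kernel.

```c
#include <stddef.h>

/* Window size from the commit comment: up to 16 arch-specific syscalls. */
#define SKETCH_ARCH_SPECIFIC_MAX 16

/* Hypothetical entry type: one function pointer per syscall number. */
typedef long (*sketch_syscall_fn)(void);

/* Minimum number of entries so the whole arch window is addressable. */
static inline size_t sketch_min_table_entries(size_t arch_start)
{
	return arch_start + SKETCH_ARCH_SPECIFIC_MAX;
}

/* Size in bytes of a flat table with the given number of entries. */
static inline size_t sketch_table_bytes(size_t entries)
{
	return entries * sizeof(sketch_syscall_fn);
}
```

Under these assumptions, starting the window at 244 needs at least 260
entries and starting it at 512 needs at least 528, a difference of 268
entries. With 8-byte function pointers that is 268 * 8 = 2144 bytes of
extra table.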

For what it's worth, from Tilera's point of view we can certainly
tolerate changes in this area; we have not released any of this new
syscall ABI stuff to customers yet, so thrashing this just involves an
internal flag day for our developers, which is not too big a deal.

> Arnd, Chris - how about making the "arch-specific" system calls start
> at 256 or something? Or even higher, like 512? Yes, it makes the
> system call array bigger, but is that really a problem? Especially as
> we start the "deprecated" system calls at 1024, it would seem to make
> sense to raise it to 512, and leave the low numbers for the "regular"
> system calls.
>
> [ I'm leaving the quoted email for the edification of Chris/Arnd that
> I added to the discussion ]
>
> Linus
>
> ---
>
>> git://decibel.fi.muni.cz/~xslaby/linux writable_limits
>>
>> Jiri Slaby (10):
>> rlimits: security, add task_struct to setrlimit
>> rlimits: add task_struct to update_rlimit_cpu
>> rlimits: split sys_setrlimit
>> rlimits: allow setrlimit to non-current tasks
>> rlimits: do security check under task_lock
>> rlimits: add rlimit64 structure
>> rlimits: redo do_setrlimit to more generic do_prlimit
>> rlimits: switch more rlimit syscalls to do_prlimit
>> rlimits: implement prlimit64 syscall
>> unistd: add __NR_prlimit64 syscall numbers
>>
>> Oleg Nesterov (2):
>> rlimits: make sure ->rlim_max never grows in sys_setrlimit
>> rlimits: selinux, do rlimits changes under task_lock
>>
>> arch/x86/ia32/ia32entry.S | 1 +
>> arch/x86/include/asm/unistd_32.h | 3 +-
>> arch/x86/include/asm/unistd_64.h | 2 +
>> arch/x86/kernel/syscall_table_32.S | 1 +
>> include/asm-generic/unistd.h | 4 +-
>> include/linux/posix-timers.h | 2 +-
>> include/linux/resource.h | 9 ++
>> include/linux/security.h | 9 +-
>> include/linux/syscalls.h | 4 +
>> kernel/compat.c | 17 +---
>> kernel/posix-cpu-timers.c | 8 +-
>> kernel/sys.c | 202 ++++++++++++++++++++++++++++--------
>> security/capability.c | 3 +-
>> security/security.c | 5 +-
>> security/selinux/hooks.c | 12 ++-
>> 15 files changed, 207 insertions(+), 75 deletions(-)
>>
>> thanks,
>> --
>> js
>> suse labs
>>
>>

--
Chris Metcalf, Tilera Corp.
http://www.tilera.com
