Re: [-mm][PATCH 3/4] Add rlimit controller accounting and control

From: Balbir Singh
Date: Thu May 08 2008 - 10:36:07 EST


Paul Menage wrote:
> On Sat, May 3, 2008 at 2:38 PM, Balbir Singh <balbir@xxxxxxxxxxxxxxxxxx> wrote:
>>
>> This patch adds support for accounting and control of virtual address space
>> limits. The accounting is done via the rlimit_cgroup_(un)charge_as functions.
>> The core of the accounting takes place during fork time in copy_process(),
>> may_expand_vm(), remove_vma_list() and exit_mmap(). There are some special
>> cases that are handled here as well (arch/ia64/kernel/perform.c,
>> arch/x86/kernel/ptrace.c, insert_special_mapping())
>>
>
> The basic idea of the patches looks fine (apart from some
> synchronization issues) but Is calling this the "rlimit" controller a
> great idea? That implies that it handles all (or at least many) of the
> things that setrlimit()/getrlimit() handle.
>
> While some of the other rlimit things definitely do make sense as
> cgroup controllers, putting them all in the same controller doesn't
> really - paying for the address-space tracking overhead just to get,
> say, the equivalent of RLIMIT_NPROC (max tasks) isn't a great idea.
>
> Can you instead give this a name that somehow refers to virtual
> address space limits, e.g. "va" or "as". That would still fit if you
> expanded it to deal with locked virtual address space limits too.
>
> I think that an "rlimit" controller would probably be best for
> representing just those limits that don't really make sense when
> aggregated across different tasks, but apply separately to each task
> (e.g. RLIMIT_FSIZE, RLIMIT_CORE, RLIMIT_NICE, RLIMIT_NOFILE,
> RLIMIT_RTPRIO, RLIMIT_STACK, RLIMIT_SIGPENDING, and maybe RLIMIT_CPU),
> in order to provide an easy way to change these limits on a group of
> running tasks.
>

I currently intend to use this controller for controlling memory related
rlimits, like address space and mlock'ed memory. How about we use something like
"memrlimit"?


> On a separate note for the address-space tracking, ideally the
> subsystem would track whether or not it was bound to a hierarchy, and
> skip charging/uncharging if not. That way there's no (noticeable)
> overhead for compiling in the subsystem but not using it. At the point
> when the subsystem was bound to a hierarchy, it could at that point
> run through all mms and charge each one's existing address space to
> the appropriate cgroup. (Currently that would only be the root cgroup
> in the hierarchy).

Good suggestion, but it will be hard if not impossible to account the data
correctly as it changes, if we do the accounting/summation at bind time. We'll
need a really big lock to do it, something I want to avoid. Did you have
something else in mind?


--
Warm Regards,
Balbir Singh
Linux Technology Center
IBM, ISTL
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/