Re: [PATCH] include private data mappings in RLIMIT_DATA limit
From: Hugh Dickins
Date: Tue Jul 10 2007 - 12:59:09 EST
On Mon, 9 Jul 2007, Dave McCracken wrote:
> On Monday 09 July 2007, Herbert van den Bergh wrote:
> > With this patch, not only memory in the data segment of a process, but
> > also private data mappings, both file-based and anonymous, are counted
> > toward the RLIMIT_DATA resource limit. ÂExecutable mappings, such as
> > text segments of shared objects, are not counted toward the private data
> > limit. ÂThe result is that malloc() will fail once the combined size of
> > the data segment and private data mappings reaches this limit.
> >
> > This brings the Linux behavior in line with what is documented in the
> > POSIX man page for setrlimit(3p).
Which says malloc() can fail from it, but conspicuously not that mmap()
can fail from it: unlike the RLIMIT_AS case. Would we be better off?
>
> I believe this patch is a simple and obvious fix to a hole introduced when
> libc malloc() began using mmap() instead of brk().
But didn't libc start doing that many years ago? Wouldn't that have
been the time for such a patch rather than now: when it can only break
apps that are currently working?
> We took away the ability
> to control how much data space processes could soak up. This patch returns
> that control to the user.
I remember thinking that the idea of data ulimit had become obsolete
(just preserved for compatibility) back when mmap() got invented and
used for dynamic libraries. I think that's when they brought in
RLIMIT_AS, something which could make sense in the mmap() world.
This patch does give it more meaning. But if we are prepared to
take the risk of breaking things in this way (I think not but don't
mind being corrected), it would be more accurate to take writability
into account, and use that quantity (sadly not stored in mm_struct,
would have to be added) which we do security_vm_enough_memory() upon,
which totals up into /proc/meminfo's Committed_AS.
That change to /proc/PID/status VmData:
- data = mm->total_vm - mm->shared_vm - mm->stack_vm;
+ data = mm->total_vm - mm->shared_vm - mm->stack_vm - mm->exec_vm;
looks plausible, but isn't exec_vm already counted as shared_vm,
so now being doubly subtracted? Besides which, we wouldn't want
to change those numbers again without consulting Albert.
Hugh