Re: Avoiding OOM on overcommit...?

From: Marco Colombo (marco@esi.it)
Date: Tue Mar 28 2000 - 11:11:19 EST


On Tue, 28 Mar 2000, Jesse Pollard wrote:

> Marco Colombo <marco@esi.it>
> > On Mon, 27 Mar 2000, Linda Walsh wrote:
> >
> > [...]
> > > Well -- that's sorta the point -- Everything from 'atd' to 'vi'
> > > would need to be rewritten to 'touch' pages of alloc'ed memory. If you want
> > > to promise integrity -- then you can choose to run with no 'virtual swap' and
> > > guaranteed _minimum_ stack space sizes allocated at run time. With the
> > > current model, say, auditd could think it malloc'ed a 2Meg buffer -- thus
> > ^^^^^^^^^
> > > it thinks it has it's space guaranteed. If we are in a OOM state, when auditd
> > > goes to access that buffer, it will SEGV -- can't map address to physical
> > > object, or a "OOM" killer routine runs and kills another process pseudo
> > > randomly. What I'm saying is we need to provide a model that doesn't
> > > overcommit. You, personally, or anyone else doesn't have to use that model.
> > > But such a model, if in the kernel would allow for operational assurance
> > > (allowing for failures to occur predictably).
> >
> > If you use plain malloc(), you're not allowed to think you have any
> > space guaranteed. It's bad programming. If you need guaranteed "space"
> > (memory) use another kernel interface, such as mlock(). I'm not saying
> > the current interface is perfect. I'm just saying that overcommitting
> > is not the problem. You don't need to turn overcommiting off. You
> > need you use a better interface than malloc() to get "safe" memory.
> > For stack grow, maybe we need some way to tell the kernel:
> > "never page-out my stack, and reserve me this space...".
> > Applications should be able to bypass kernel management of their address
> > space. But this should be done on a per-app base.
> >
> > [...]
>
> There is nothing wrong with sbrk or malloc. The kernel just doesn't
> really provide the use of the address range when requested. ANY method
> of touching memory after the sbrk call is subject to race conditions.
> It just may not occur in a single processor environment very often.
> In the SMP world it can easily occur.

That's precisely why "you are not allowed to think you have any space
guaranteed" after malloc(). If you assume malloc() allocates anything,
that's bad programming. There are other ways to get "guaranteed space".
I've never said there's something wrong with brk(). I've said that
an aplication MAY need to take control over the mm of its address space.
Most of the applications need not, so they use just malloc(). And get
killed at OOM.

> Malloc returns null if allocation fails - it can fail only when the kernel
> tells the process it will not allocate more via sbrk.
>
> "never page-out my stack" is unreasonable, especially for large jobs
> that may need a very large stack. The stack frames may not be referenced
> very often. It isn't that often that large buffers are created there.
> There are pointers to buffers there, and usually only the current stack
> frame is used for these.

Large jobs, with a large stack, usually let their stack grow in a
uncontrolled way.
So they can't predict how much stack they need, and can't ask for any
"reservation", either memory or swap. They'll risk to run out of stack
anyway.
Every process that asks for a "reserved" swap space for its stack, runs
with a "fixed" stack. So has to keep stack utilization under control
(e.g. no recursion). You can keep your stack small, if you control it.
So, just providing a way to mlock() your stack (its maximum size),
should be enough. No need to do swap reservation.

> -------------------------------------------------------------------------
> Jesse I Pollard, II
> Email: pollard@navo.hpc.mil
>
> Any opinions expressed are solely my own.

.TM.

-- 
      ____/  ____/   /
     /      /       /			Marco Colombo
    ___/  ___  /   /		      Technical Manager
   /          /   /			 ESI s.r.l.
 _____/ _____/  _/		       Colombo@ESI.it

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.rutgers.edu Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Fri Mar 31 2000 - 21:00:22 EST