Re: Avoiding OOM on overcommit...?

From: Linda Walsh (law@sgi.com)
Date: Mon Mar 27 2000 - 00:23:16 EST


Richard Gooch wrote:
> Each of these options is flawed:
>
> 1&2) you have to do this for all processes

---
	You have to assign process ID's, memory mappings, etc to every
process.  So what's your point?  Having a default guaranteed stack
space is automatic.  Creating extensions to the automatic default is the
responsibility of the programmer.  These are *minimums* to reserve.  Providing
a minimum is no *worse* behavior than the current provision of 'none'.  I'd
assert that it is *better* from the standpoint that I could guarantee a
fixed amount of stack space for each process -- and a user or programmer could
adjust that fixed 'minimum to ask for' upwards.  If the minimum is not there,
you get an unrunnable program from the start.  If it is and the minimum is
greater than maximum stack usage, you have guaranteed memory to run in
relation to the stack.

> > 3) you can't handle a signal caused by stack exhaustion --- Why not? If the default action is to kill the process running out of stack. Again, no worse than current and it affects the current process, not some unrelated process.

> > 4) deadlock. --- Deadlock is a risk on any system. There are no guarantees that resource exhaustion won't lead to deadlock -- in fact, I'd call this a predictable failure. Deadlock is not guaranteed, however -- processes *may* 'exit' or release memory. The behavior of the machine on lack of resources is a deterministic failure. Killing random processes is a non-deterministic failure.

And with regards to algorithms to select processes to kill. From the perspective of any user or running process it *is* random -- since no user or process is going to exactly know the cpu time, run time, memory usage and memory priority of every other process is. It is, more precisely, pseudo-random. I maintain that random results of an OOM condition are worse than predictable results even if the predictable result group includes deadlock.

> That was 4 null options. I'd settle for one that could be shown to > work, or at least reasonably expected to. --- My intention is for the state of the system after an OOM failure to be as predictable as possible. Ideally, completely predictable. I would say all of those achieve that end. So what is your definition of "null" and what is your proposal for deterministic failure that maintains the integrity of the system?

> Because there isn't always syscall interface for getting memory. --- But if you apply user and/or programmer defaults for stack, there is for for every other access. Thus you have a predictable result for both stack and other memory access. > > It's better not to claim something at all than claim it > and people find out later it's not true. --- How is the above scheme untrue in any of it's claims? -l

-- Linda A Walsh | Trust Technology, Core Linux, SGI law@sgi.com | Voice: (650) 933-5338

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.rutgers.edu Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Fri Mar 31 2000 - 21:00:18 EST