Re: Overcommitable memory??

From: James Sutherland (jas88@cam.ac.uk)
Date: Mon Mar 20 2000 - 08:19:28 EST


On 20 Mar 2000 1:8:46 +0100, you wrote:

>Den 16-Mar-00 16:08:24 skrev James Sutherland følgende om "Re: Overcommitable memory??":
>> On 15 Mar 2000, Rask Ingemann Lambertsen wrote:
>
>> You can still run out of memory, with or without COW, overcommit etc. I
>> think we are talking at cross purposes here: By running out of memory, I
>> mean there isn't any free memory to give to apps requesting it (via
>> malloc() etc.)
>
> Right. See below.
>
>> If Netscape blows up and grabs all my memory, other applications can't get
>> any - so I have problems. I need Netscape to be killed to fix the problem
>
> Lots of people participating in this discussion will tell you that apps
>just die when they get a null return from malloc(), so the problem should
>solve itself when Netscape gets null from malloc() and gives up.
>
> Actually, just having overcommitment turned off will itself tend to keep
>a little memory free for someone to log in and sort out the situation.
>Let's say the system is down to 500 kB of free memory (RAM+swap) and a
>program requests 600 kB. Now, without overcommitment of memory, the
>allocation will fail and the system will still have 500 kB of free memory
>left, all your daemons will still be running, etc.

No; malloc() still fails, because the 600K allocation exceeds free VM.
Only in the extreme case, "suicide mode" will the situation you
describe arise. (This could be useful if you have code allocating a
single large sparse matrix, for example.)

> You'd need a malicious
>program behaviour to run the system completely out of memory. If however,
>memory is overcommitted, the program will be allowed to use more memory and
>gradually eat away memory, one page at a time, until there is not even a
>single free page left in the system, at which point the OOM killer has to
>kill one or more processes, possibly some that were doing just fine with
>the memory they had already allocated. Any program, including well behaved
>ones, could cause that situation.

You are forgetting that disabling overcommit has reduced the available
VM significantly - by the time my overcommit system is really OOM,
yours barfed hours earlier because it had reserved VM just in case.

>> - and I can't log in to do it manually. THIS is the OOM situation I am
>> thinking of, not a kernel-level problem of being unable to handle a page
>> fault.
>
> Right. Two different kinds of OOM situations are:
>
>1) Applications can't allocate memory because malloc() (or any other memory
>allocation mechanism) fails. It is a real possibility, and applications
>will have to be able to deal with it. Preventing this situation from taking
>a system down is a question of system administration.
>
>2) An application accesses a part of its allocated address space and the
>kernel can't handle it because all RAM and swap has been used. This can
>only happen when the kernel has overcommitted memory. Short of suspending
>the currently running process until memory is freed, there isn't much the
>kernel can do except to kill a process and free memory that way.

Why had the process allocated unused memory anyway? In the normal case
(allocate a buffer, then use it) overcommit never comes into play. It
is only if you malloc() a fairly large buffer (a few K or more) and
then do not use it until much later that there is a potential problem.

Just touch memory when you allocate it. Then, you will never have
these problems with implicit allocation of memory (except on the
stack...)

> Since 2) can be avoided simply by not overcommitting memory, it should
>at least be possible to turn it off.

Situation 2 is a BIGGER problem without overcomit. Instead of failing
if there is a REAL problem, the application fails because of a
POTENTIAL problem. No real improvement.

James.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Thu Mar 23 2000 - 21:00:29 EST