Re: Some questions about linux kernel.

From: James Sutherland (jas88@cam.ac.uk)
Date: Sun Mar 19 2000 - 13:38:43 EST

Next message: Steve Dodd: "Re: Tentative patch: modularized disk partition systems in 2.3.99-pre2-5"
Previous message: Linus Torvalds: "Re: new IRQ scalability changes in 2.3.48"
In reply to: Jesse Pollard: "Re: Some questions about linux kernel."
Next in thread: Jesse Pollard: "Re: Some questions about linux kernel."
Reply: Jesse Pollard: "Re: Some questions about linux kernel."
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Fri, 17 Mar 2000 18:05:17 -0600, you wrote:
>On Thu, 16 Mar 2000, James Sutherland wrote:
>>On Thu, 16 Mar 2000, Paul Jakma wrote:
>>> On Thu, 16 Mar 2000, James Sutherland wrote:
>>>
>>> > No. *ANY* memory allocation system can run out of memory. Avoiding
>>> > "overcommitting" would make the OOM situation arise SOONER (and more
>>> > frequently), as well as killing performance.
>>> >
>>>
>>> well, it's a more a question of whether you make promises that you might
>>> not be able to keep. If you do (ie overcommit) then it's your
>>> (kernel) problem. If you don't, it's not.
>>
>>Not really; either way, we are being asked for memory we can't currently
>>provide.
>>
>>> Without overcommit the /system/ can run out memory, of course - it's
>>> finite - but it's no longer a kernel problem.
>>
>>It was never really a "kernel problem" as such anyway. It's a problem the
>>kernel is trying to fix.
>
>And where it couldn't detect the beginning of the problem.

The beginning of the problem? The problem is that we don't have enough
VM - every malloc() call since the system booted contributes.

>>> > Right... Now we'll try this on the university's central Unix system, shall
>>> > we? Let's see... 6000 users, 2Gb RAM+swap. They get about 300K each.
>>> > That's ALMOST enough to log in with!
>>>
>>> well then get more ram/swap. But at least it has become a hw issue.
>>
>>It was all along - you don't have enough RAM+swap for the workload the
>>system is under, so processes are dying.
>
>The problem is detecting when it is running out of memory.

It's fairly obvious, IMO - the "free memory" numbers get low. malloc()
calls start failing. Processes start dying.

(snip)
>Thats why per user and per process resource limits need to be implemented.

Yes - but what's this got to do with overcommit?

(snip)
>>Yes, I agree. However, we can still run out of RAM+swap on multiuser
>>systems quite easily.
>
>Only if they are mismanaged - allowed to have too many concurrent users,
>with too many concurrent processes. I am part of a center that runs
>UNIX systems that don't have this problem. The per user resource limits
>are taken into account for the number of concurrent interactive users,
>and the number of batch queues (with concurrent jobs). The total amount
>of required resources is then allocated. There IS no OOM on this system.
>Can't happen.

No, you have just isolated the users from each other a little better
than normal. In doing so, you have placed very severe restrictions on
their use of the system; this wouldn't be tolerated here.

OK, once in a blue moon, a user's rogue process blows up and grabs
half the system's VM. It then gets killed by Rik's patch. No problems
at all, other than the system being slowed down a bit.

>>> > No, it's a risk with *EVERY* OS.
>>>
>>> no it's not. A non overcommiting OS doesn't run out of VM for
>>> processes. it cleanly grants or denies memory requests. What the app
>>> does after that is not a kernel problem.
>>
>>The problem is, denying memory requests leads to processes dying. This is
>>what we want to avoid.
>
>No it isn't. What you want to avoid is system crashes/reboots.

If your system crashes or reboots just because a program has taken all
the VM, it is a very poor system indeed. Even Windows NT doesn't do
this.

> - If a user
>process chooses to exceed its' resource limits then that process is aborted. The
>system doesn't crash. The process to abort is clearly known.

Exactly what happens with Rik's patch anyway - only the limits are set
high enough to allow the user to use more memory, *if* this can be
done without any problems.

>>In fact, last night I more or less killed this machine. I had overcommit
>>turned off, and nothing major loaded (X, WWW server, xmms) and fired up
>>"make -j" on a biggish source tree. After quite some time, just about
>>everything died from lack of memory.
>
>too bad - user resource limits would have signaled the out of resources to
>the make process (it should have been given the OOM signal when it forked/execed
>the compiler/linkger). This could then be used by make to determine that
>resource limits had been reached, and not to spawn more. Yes this does
>require make to be modified. Otherwise the make would have aborted; BUT
>the system would not fail.

The system didn't fail - but the user's other processes all died due
to a lack of memory. Setting a lower VM quota for myself would just
have made the problem worse - all my processes would have died
earlier.

>>This was WITHOUT the OOM killer patch. With it, I think the newly spawned
>>make and cc processes would have been killed off much sooner, helping save
>>the other processes.
>>
>>
>>How would you define "running out" of memory? I would define it as not
>>being able to fulfil new requests - which is the situation we are trying
>>to handle here. If you start denying malloc() requests, processes start
>>dying (largely at random). If, instead, you kill selected processes, then
>>obviously processes still die as a result. However, they are more likely
>>to be the "right" ones to kill.
>
>running out of memory: insufficient resources available to a specific
>user to continue running processes.

>Denying malloc() requests should not "largly at random" cause processes
>to die; A very specific process will die - the one making the sbrk system
>call. A different user would not see a problem since they should have a
>different set of resources. The general system daemons would not see a
>problem either - they also would have a different set of reserved resources.

Fine provided you have enough system resources - which you don't
always. This is where Rik's patch is needed. Once the system has run
out of VM, ANY process making a malloc() call will be denied the
request, for obvious reasons - and will then die as a result.

OK, you can often restrict the process death to one particular user -
but that user would still have random processes dying, instead of the
biggest memory hog.

James.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/

Next message: Steve Dodd: "Re: Tentative patch: modularized disk partition systems in 2.3.99-pre2-5"
Previous message: Linus Torvalds: "Re: new IRQ scalability changes in 2.3.48"
In reply to: Jesse Pollard: "Re: Some questions about linux kernel."
Next in thread: Jesse Pollard: "Re: Some questions about linux kernel."
Reply: Jesse Pollard: "Re: Some questions about linux kernel."
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

This archive was generated by hypermail 2b29 : Thu Mar 23 2000 - 21:00:27 EST