Re: Some questions about linux kernel.

From: Jesse Pollard (pollard@cats-chateau.net)
Date: Sun Mar 19 2000 - 15:23:00 EST


On Sun, 19 Mar 2000, James Sutherland wrote:
>On Fri, 17 Mar 2000 18:05:17 -0600, you wrote:
>>On Thu, 16 Mar 2000, James Sutherland wrote:
>>>On Thu, 16 Mar 2000, Paul Jakma wrote:
>>>> On Thu, 16 Mar 2000, James Sutherland wrote:
>>>>
>>>> > No. *ANY* memory allocation system can run out of memory. Avoiding
>>>> > "overcommitting" would make the OOM situation arise SOONER (and more
>>>> > frequently), as well as killing performance.
>>>> >
>>>>
>>>> well, it's a more a question of whether you make promises that you might
>>>> not be able to keep. If you do (ie overcommit) then it's your
>>>> (kernel) problem. If you don't, it's not.
>>>
>>>Not really; either way, we are being asked for memory we can't currently
>>>provide.
>>>
>>>> Without overcommit the /system/ can run out memory, of course - it's
>>>> finite - but it's no longer a kernel problem.
>>>
>>>It was never really a "kernel problem" as such anyway. It's a problem the
>>>kernel is trying to fix.
>>
>>And where it couldn't detect the beginning of the problem.
>
>The beginning of the problem? The problem is that we don't have enough
>VM - every malloc() call since the system booted contributes.

That is a philosophical answer, and an unusable one in practice.

The beginning of the problem is the process or process group that is allocating
more than its allotted quota.

>
>>>> > Right... Now we'll try this on the university's central Unix system, shall
>>>> > we? Let's see... 6000 users, 2Gb RAM+swap. They get about 300K each.
>>>> > That's ALMOST enough to log in with!
>>>>
>>>> well then get more ram/swap. But at least it has become a hw issue.
>>>
>>>It was all along - you don't have enough RAM+swap for the workload the
>>>system is under, so processes are dying.
>>
>>The problem is detecting when it is running out of memory.
>
>It's fairly obvious, IMO - the "free memory" numbers get low. malloc()
>calls start failing. Processes start dying.

Too late - by then it has already run out. "When it is running out" means
before it has run out, and that is when detection has to happen.
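
A rough sketch of what "before" could look like, from userspace: watch
/proc/meminfo and flag the condition while there is still headroom. (The
10% threshold and 5-second poll are arbitrary choices of mine, and this is
a hypothetical watchdog, not anything in the kernel.)

/* low-memory watchdog sketch: warn while memory is running out,
 * not after it has run out */
#include <stdio.h>
#include <unistd.h>

int main(void)
{
	for (;;) {
		FILE *f = fopen("/proc/meminfo", "r");
		char line[256];
		long memtotal = 0, memfree = 0, swaptotal = 0, swapfree = 0;

		if (!f)
			return 1;
		while (fgets(line, sizeof(line), f)) {
			sscanf(line, "MemTotal: %ld", &memtotal);
			sscanf(line, "MemFree: %ld", &memfree);
			sscanf(line, "SwapTotal: %ld", &swaptotal);
			sscanf(line, "SwapFree: %ld", &swapfree);
		}
		fclose(f);

		if (memtotal + swaptotal > 0 &&
		    (memfree + swapfree) * 10 < memtotal + swaptotal)
			fprintf(stderr, "warning: <10%% of RAM+swap free\n");
		sleep(5);
	}
}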

>
>(snip)
>>Thats why per user and per process resource limits need to be implemented.
>
>Yes - but what's this got to do with overcommit?

1. Controlling overcommit
2. Identifying the process using too much
3. Preventing that process from killing the system
4. Making it possible to determine how much is to be added to the system.
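
The per-process half of this already has an interface - setrlimit(). What is
missing is the per-user and per-group accounting built on top of it. A minimal
sketch of the per-process part (RLIMIT_DATA only limits sbrk()-based
allocation, so treat this as illustrative, not airtight):

/* cap the heap of a child job at 32 MB, then exec it; past the cap,
 * sbrk() fails and malloc() returns NULL - the system stays up */
#include <stdio.h>
#include <sys/resource.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
	struct rlimit rl;

	rl.rlim_cur = 32 * 1024 * 1024;		/* soft limit */
	rl.rlim_max = 32 * 1024 * 1024;		/* hard limit */
	if (setrlimit(RLIMIT_DATA, &rl) < 0) {
		perror("setrlimit");
		return 1;
	}
	if (argc > 1)
		execvp(argv[1], &argv[1]);
	return 0;
}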

>
>(snip)
>>>Yes, I agree. However, we can still run out of RAM+swap on multiuser
>>>systems quite easily.
>>
>>Only if they are mismanaged - allowed to have too many concurrent users,
>>with too many concurrent processes. I am part of a center that runs
>>UNIX systems that don't have this problem. The per user resource limits
>>are taken into account for the number of concurrent interactive users,
>>and the number of batch queues (with concurrent jobs). The total amount
>>of required resources is then allocated. There IS no OOM on this system.
>>Can't happen.
>
>No, you have just isolated the users from each other a little better
>than normal. In doing so, you have placed very severe restrictions on
>their use of the system; this wouldn't be tolerated here.
>
>OK, once in a blue moon, a user's rogue process blows up and grabs
>half the system's VM. It then gets killed by Rik's patch. No problems
>at all, other than the system being slowed down a bit.

It may not be a user's rogue process - it may be working exactly as it should,
loading up a large data set.

Unfortunately, that doesn't always work. It is better than a random kill, but
not acceptable to management. It is a stopgap workaround that doesn't help
identify who, what, and where the memory is being taken.
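
The "who and where" part is at least visible from userspace today; a rough
sketch of finding the biggest VM user by scanning /proc (VmSize as reported
in /proc/<pid>/status):

/* sketch: walk /proc and report the process with the largest VmSize.
 * This answers "who is taking the memory", which a blind kill doesn't. */
#include <ctype.h>
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
	DIR *proc = opendir("/proc");
	struct dirent *de;
	long biggest = 0;
	int hog = -1;

	if (!proc)
		return 1;
	while ((de = readdir(proc)) != NULL) {
		char path[64], line[256];
		long vmsize = 0;
		FILE *f;

		if (!isdigit((unsigned char)de->d_name[0]))
			continue;
		snprintf(path, sizeof(path), "/proc/%s/status", de->d_name);
		f = fopen(path, "r");
		if (!f)
			continue;
		while (fgets(line, sizeof(line), f))
			if (sscanf(line, "VmSize: %ld", &vmsize) == 1)
				break;
		fclose(f);
		if (vmsize > biggest) {
			biggest = vmsize;
			hog = atoi(de->d_name);
		}
	}
	closedir(proc);
	if (hog > 0)
		printf("biggest VM user: pid %d (%ld kB)\n", hog, biggest);
	return 0;
}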

>
>>>> > No, it's a risk with *EVERY* OS.
>>>>
>>>> no it's not. A non overcommiting OS doesn't run out of VM for
>>>> processes. it cleanly grants or denies memory requests. What the app
>>>> does after that is not a kernel problem.
>>>
>>>The problem is, denying memory requests leads to processes dying. This is
>>>what we want to avoid.
>>
>>No it isn't. What you want to avoid is system crashes/reboots.
>
>If your system crashes or reboots just because a program has taken all
>the VM, it is a very poor system indeed. Even Windows NT doesn't do
>this.

But Linux does...
>
>> - If a user
>>process chooses to exceed its resource limits then that process is aborted. The
>>system doesn't crash. The process to abort is clearly known.
>
>Exactly what happens with Rik's patch anyway - only the limits are set
>high enough to allow the user to use more memory, *if* this can be
>done without any problems.

It isn't complete - I still need to be able to control the memory allocation,
and to prevent the system from crashing.

>>>In fact, last night I more or less killed this machine. I had overcommit
>>>turned off, and nothing major loaded (X, WWW server, xmms) and fired up
>>>"make -j" on a biggish source tree. After quite some time, just about
>>>everything died from lack of memory.
>>
>>too bad - user resource limits would have signaled the out-of-resources
>>condition to the make process (it should have been given the OOM signal when
>>it forked/execed the compiler/linker). This could then be used by make to
>>determine that resource limits had been reached, and not to spawn more. Yes,
>>this does require make to be modified. Otherwise the make would have aborted;
>>BUT the system would not fail.
>
>The system didn't fail - but the user's other processes all died due
>to a lack of memory. Setting a lower VM quota for myself would just
>have made the problem worse - all my processes would have died
>earlier.

But the system would have continued to run. If that user can justify expanding
the resources, then management can allocate money/time/disk space to expand
the limited resource.
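
The make change I am describing is small: treat fork() failing with
EAGAIN/ENOMEM as "resource limit reached" and throttle instead of aborting.
A sketch of the idea (not GNU make's actual job code; run_job() is a
stand-in for exec'ing the compiler/linker):

#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* stand-in for exec'ing the compiler/linker */
static void run_job(void)
{
	execlp("true", "true", (char *)0);
	_exit(127);
}

int main(void)
{
	int i, running = 0;

	for (i = 0; i < 100; i++) {
		pid_t pid = fork();

		if (pid == 0) {
			run_job();
		} else if (pid > 0) {
			running++;
		} else if (errno == EAGAIN || errno == ENOMEM) {
			/* limit reached: reclaim a job and retry - don't abort */
			if (running > 0 && wait(NULL) > 0)
				running--;
			else
				sleep(1);	/* nothing to reclaim; back off */
			i--;
		} else {
			perror("fork");
			break;
		}
	}
	while (running > 0 && wait(NULL) > 0)
		running--;
	return 0;
}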

>>>This was WITHOUT the OOM killer patch. With it, I think the newly spawned
>>>make and cc processes would have been killed off much sooner, helping save
>>>the other processes.
>>>
>>>
>>>How would you define "running out" of memory? I would define it as not
>>>being able to fulfil new requests - which is the situation we are trying
>>>to handle here. If you start denying malloc() requests, processes start
>>>dying (largely at random). If, instead, you kill selected processes, then
>>>obviously processes still die as a result. However, they are more likely
>>>to be the "right" ones to kill.
>>
>>running out of memory: insufficient resources available to a specific
>>user to continue running processes.
>
>>Denying malloc() requests should not "largely at random" cause processes
>>to die; a very specific process will die - the one making the sbrk system
>>call. A different user would not see a problem since they should have a
>>different set of resources. The general system daemons would not see a
>>problem either - they also would have a different set of reserved resources.
(I should have expanded this from "sbrk" to "sbrk/fork")
>
>Fine provided you have enough system resources - which you don't
>always. This is where Rik's patch is needed. Once the system has run
>out of VM, ANY process making a malloc() call will be denied the
>request, for obvious reasons - and will then die as a result.

If I don't have enough resources, then I expect the system to limit the
damage to the user asking for too much. No other users should be affected.
The user can then request management to allocate/obtain more.
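
And "damage" here means nothing worse than a failed allocation, which the
offending process sees immediately and can handle cleanly. (Under an
overcommitting kernel malloc() may never return NULL, which is exactly the
point in dispute; under a data-segment limit it does.) A sketch:

/* a process that hits its quota fails deterministically, inside
 * itself, at the malloc() that went over - nothing random about it */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
	unsigned long mb = 0;

	for (;;) {
		char *p = malloc(1024 * 1024);

		if (p == NULL) {
			fprintf(stderr, "quota reached after %lu MB; "
				"saving state and exiting cleanly\n", mb);
			return 0;
		}
		memset(p, 1, 1024 * 1024);	/* touch it so it is really ours */
		mb++;
	}
}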

>OK, you can often restrict the process death to one particular user -
>but that user would still have random processes dying, instead of the
>biggest memory hog.

It is not random. A specific user/process is killed. The system doesn't die,
the system doesn't reboot, and the other users of the system are not affected.

If the biggest memory hog is authorized the amount it desires, then it
should get it. That is determined by management, not the OS.

-------------------------------------------------------------------------
Jesse I Pollard, II
Email: pollard@cats-chateau.net

Any opinions expressed are solely my own.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Thu Mar 23 2000 - 21:00:27 EST