Re: Avoiding OOM on overcommit...?

From: James Sutherland (jas88@cam.ac.uk)
Date: Fri Mar 24 2000 - 09:47:01 EST


On Tue, 21 Mar 2000 13:20:05 -0600 (CST), you wrote:
>James Sutherland <jas88@cam.ac.uk>:
>> On Mon, 20 Mar 2000 20:50:08 -0600, you wrote:
>> >On Mon, 20 Mar 2000, Horst von Brand wrote:
>> >>Jesse Pollard <pollard@tomcat.admin.navo.hpc.mil> said:
>> >>
>> >>[...]
>> >>
>> >>> Without overcommit:
>> >>> You can support 100 simultaneous connections, with full saturation of
>> >>> each server, with no failures.
>> >>>
>> >>> Result: happy customers, happy management, maybe even a raise.
>> >>
>> >>Management nagging about supporting more pages, worried by waste of several
>> >>Gb of disk that has never, ever been touched. System is sluggish, needs
>> >>constant tweaking of "resource allocation quotas" as applications crash,
>> >>even there are resources available. Seriously consider firing inept
>> >>sysadmin.
>> >
>> >Disk space is cheap.
>>
>> Still not cheap enough to throw away, unless you are Boeing.
>>
>> >System is no more sluggish than currently, and is much more reliable.
>>
>> It is slower, because of the extra overhead you have imposed on every
>> memory allocation. And it is no more reliable, except in OOM
>> situations - which your draconian resource quotas make MUCH more
>> frequent for the users.
>>
>> >First, "constant tweaking" means the system was/is undersized, and the
>> >accounting records justify the additional business support for expansion.
>>
>> Point to a system which has never passed 50% of VM in use, and your
>> request will be laughed out of purchasing.
>
>Cray YMP-8 - reached 100% several years ago. Replace by C90, C90 replaced
>by dual SV1 system. SV1 system at 80%, expanding to 4 nodes.
>SGI Power Challenge array - reached 90% usage in 1 year. Replaced by Origin
>2000. Origin 2000 (128 node) updated when load reached 80% (with several
>OOM crashes due to managements choice to overcommit)

What do you mean by "overcommit" in this case - the total online user
quota exceeding total VM, or the Linux (memory allocated on demand)
sense?

And are you seriously telling me that if I take an Irix box, disable
user VM quotas and do a malloc() loop, the system crashes?! (Mental
note: Replace all Irix boxes with NT)

> Added another 128 nodes,
>and a TB of disk (30-40GB used for swap.. still overcommitted, but not as
>severly). I have more.
>
>>
>> >Second, If it is determined that overcommitment is necessary, then it can
>> >still be done - along with the documentation on what happens when the system
>> >crashes.
>>
>> Overcommit does not cause system crashes; nor is it related in any way
>> to user resource quotas.
>
>Of course it is. If too many users are on one system trying to do too much
>at one time the system fails.

"The system fails"?? Either you have one broken system, or you have a
strange definition of "failure". The users should just get errors that
they don't have the resources available.

> Resource quotas force the users to schedule
>their work, and to cooperate with the system to accomplish the goals of
>business management.
>
>> >I've never been fired for guaranteeing system uptime.
>>
>> You should be, if you do so by disabling legitimate use of the system.
>
>I didn't - legitimate use was guaranteed. Crashing the system wasn't one
>of the legitimate uses. Nor was causing the failure of other users processes.

Crashing the system should not be possible under any circumstances,
regardless of resource quotas. If a user-mode program can crash the
system using only unprivileged syscalls, you have found a nasty bug.

>> >Obviously you don't work where uptime is counted in the thousands of dollars
>> >per hour.
>>
>> If users are unable to perform legitimate tasks due to a "lack of
>> resources", when the machine has ample resources to spare, I would
>> consider the system down. I would then fire the sysadmin who
>> configured it that inefficiently.
>
>Thats just it - the system was saturated.

Your quota settings would cause the system to begin denying resource
allocation BEFORE the system is saturated.

>> However, if you REALLY insist on setting the system up so people can
>> login and display all sorts of nice OOM error messages on an idle
>> system, go ahead. I'm not stopping you.
>
>If it were idle there would be no OOM error messages :)

Yes there would - you run out of YOUR memory, even when the system has
plenty to spare.

>> >>> If you overcommit:
>> >>> You might support 100 simultaneous connections, but not full saturation
>> >>> of each server, without aborting some connections or crashing the
>> >>> server/system.
>> >>>
>> >>> Result: some unhappy customers, not so happy management, difficulty
>> >>> in identifying problems, and definitely no raise.
>> >>
>> >>Other customers are unhappy because of long download times, NS (or
>> >>whatever) crashing, not getting to the site because of (local or remote)
>> >>network congestion. The "catastrophe" isn't even a blip on the sysadmin's
>> >>screen.
>> >
>> >Yup - long download times since the system keeps crashing, forcing the
>> >customer to a different site and or different company.
>>
>> No, the system never gets as far as crashing, because the inept
>> sysadmin has already disabled it.
>
>not worth answering.

Your approach, with a typical workload, would waste a large percentage
of the system's resources. Users rarely max out their quotas - but you
set the system to ensure that a user with a 128Mb quota has 128Mb
exclusively reserved for their use alone anyway?

>> >The catastrophe is loss of business, loss of reputation, going out of
>> >business because you can't keep the customers.
>>
>> The customers have already gone somewhere they can actually run
>> software without OOM errors.
>
>not your site - it keeps crashing.

Why? If I have the same level of resources as you, so the same
workload will succeed without any problems.

You still don't appear to understand what overcommit means in this
context.

James.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Fri Mar 31 2000 - 21:00:13 EST