Re: Avoiding *mandatory* overcommit...

From: Jesse Pollard (pollard@cats-chateau.net)
Date: Fri Mar 31 2000 - 17:18:16 EST


On Fri, 31 Mar 2000, Marco Colombo wrote:
>On Thu, 30 Mar 2000, Linda Walsh wrote:
>
>[...]
>> Marco Colombo wrote:
>> > If you use plain malloc(), you're not allowed to think you have any
>> > space guaranteed. It's bad programming.
>> ---
>> ?! mlock locks pages in memory. I just want to malloc (from the
>> man page):
>>
>> malloc() allocates size bytes and returns a pointer to the
>> allocated memory. The memory is not cleared.
>> ...
>> For calloc() and malloc(), the value returned is a pointer
>> to the allocated memory, which is suitably aligned for any
>> kind of variable, or NULL if the request fails.
>>
>>
>> It's not bad programming to expect that malloc will allocate memory.
>> It's the documented interface. It is the documented interface to return
>> NULL if it cannot allocate the memory. With overcommit, the kernel has
>> broken this model because the memory isn't really allocated -- just the
>> process's top of heap pointer has been moved. My contention is that this
>> is not ANSI-C compliant.
>
>According to ANSI-C, for(;;); will run forever. Most believe the
>Universe is finite in Time, so why don't you complain on the
>universe-hackers@vger.god. mailing list? B-)

That has nothing to do with memory management.

>Memory is just a concept, just as time, in the definition of a programming
>language. The OS maps (literally) that concept to real "resources"
>(RAM, swap, ...). So "allocated memory" means *nothing* in the malloc()
>manual. The OS chooses to implement "memory" the way it likes. It
>can be just plain RAM in a single address space (unprotected memory), where
>malloc() allocates "system" memory. Or can be disk space, with RAM used
>only as a cache for recently used parts. Or a piece of VM (swap+RAM).

So, I take it that Linux doesn't really exist - it's just a concept....

>On OOM, you don't get any C error. The *program* does not fail in any way.
>It's the *process* that gets killed. The C standard knows nothing about
>what a process is, and how a process interacts with the system.

Sure don't - you should get an ENOMEM for most things.
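
To make the ENOMEM point concrete, here is a minimal sketch, assuming a
POSIX system with a typical libc. The helper name alloc_and_touch() is
hypothetical, not anything from the kernel or libc. It shows the only
failure the documented malloc() interface can report (NULL, usually with
errno set to ENOMEM), and then writes to every page, which is the step
where an overcommitting kernel actually has to find backing store.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: allocate len bytes and write to every page.
 * Returns 0 on success, or an errno value (normally ENOMEM) when
 * malloc() itself refuses the request. */
int alloc_and_touch(size_t len)
{
    if (len == 0)
        return 0;                       /* nothing to allocate or touch */

    errno = 0;
    unsigned char *p = malloc(len);
    if (p == NULL)
        return errno ? errno : ENOMEM;  /* the documented failure path */

    /* Volatile stores so the compiler cannot drop the accesses. With
     * overcommit enabled, it is these first writes, not malloc(), that
     * force the kernel to supply real pages. */
    volatile unsigned char *q = p;
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    for (size_t off = 0; off < len; off += page)
        q[off] = 1;
    q[len - 1] = 1;

    free(p);
    return 0;
}
```

A request that cannot possibly fit in the address space, such as
alloc_and_touch(SIZE_MAX), comes back with an error instead of a
pointer. A large but plausible request may succeed in malloc() and only
run into trouble inside the loop, which is the behaviour under dispute.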

>In a UNIX-like environment, program bugs usually cause some system events
>on the process that is used to run the program. But we have
>to thank the UNIX design for this. Under DOS, program bugs (a runaway
>pointer, for example) are more difficult to track. Conversely,
>it is not true that the system delivers signals to a process only because
>the program it runs has a bug. Silly example (I have already made):
>SIGTERM on shutdown, SIGHUP on control tty hangup, SIGINT for ^C,
>SIGTSTP on ^Z, or even SIGUSR[12], can be received by a process running
>a legal ANSI-C program, causing actions to be taken, without the ANSI
>standard even mentioning them. That's *UNIX programming*, not C programming.
>So the standard we should refer to (among others) is POSIX, not ANSI-C.
>And BTW, under Linux I program using cat | gas... who cares about ANSI-C? B-)

This isn't relevant.

>Memory allocation in a C program is a completely different concept from
>memory allocation by a UNIX process. A process does not allocate memory
>at all. It just requests its address space to be extended. See brk()
>manual. On Solaris 2.5.1, the man page clearly states that space gets
>allocated. And, among possible errors "ENOMEM: Insufficient space exists
>in the swap area to support the expansion.", indicating that available
>swap (and not VM) is checked, BTW.

If the process address space is extended then the process expects to use
it.

>
>My RedHat Linux 6.1 brk() man page states just that:
> brk sets the end of the data segment to the value specified
> by end_data_segment. end_data_segment must be greater
> than end of the text segment and it must be 16kB before
> the end of the stack.
>
>It says *nothing* about allocating space. The "non allocating" behaviour
>*is* documented. So, I have to say it again, if a program uses malloc()
>expecting the kernel to really allocate resources to it, it is *buggy*.
>It should use another interface. mlock() is one way to get real resources
>(RAM). I'm not saying that the interface it provides is enough for all
>your needs: but it should be clear that malloc() is NOT what you should
>use when you need real allocation.

Actually that sounds more like a documentation bug. If brk cannot allocate
memory then nothing can allocate memory, and no process can trust its own
storage.
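
For comparison, the mlock() route suggested above might look roughly
like the following. This is only a sketch, assuming Linux or another
system with anonymous mmap(); alloc_locked() and free_locked() are
hypothetical names. Note that mlock() can fail (for example with ENOMEM
or EPERM when RLIMIT_MEMLOCK is exceeded), so the caller still has to
check the result.

```c
#define _DEFAULT_SOURCE                 /* for MAP_ANONYMOUS on glibc */
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: map len bytes of anonymous memory and pin it
 * in RAM. Returns the buffer, or NULL if either step fails. */
void *alloc_locked(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;

    /* mlock() faults the pages in and keeps them resident. */
    if (mlock(p, len) != 0) {
        munmap(p, len);
        return NULL;
    }
    return p;
}

/* Hypothetical helper: undo alloc_locked(). */
void free_locked(void *p, size_t len)
{
    munlock(p, len);
    munmap(p, len);
}
```

Locked pages are resident RAM taken away from everything else, so this
only makes sense for small, critical buffers and not as a general
replacement for malloc().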

>
>> > If you need guaranteed "space"
>> > (memory) use another kernel interface, such as mlock(). I'm not saying
>> > the current interface is perfect. I'm just saying that overcommitting
>> > is not the problem. You don't need to turn overcommitting off. You
>> > need to use a better interface than malloc() to get "safe" memory.
>> ---
>> Not if we claim to be ANSI compliant.
>
>But I don't claim to be ANSI compliant: I'm Italian. B-)
>
>The *kernel* is not ANSI (C) compliant. It's a compiler issue, not OS.
>Maybe you mean POSIX?

The kernel is supposed to be POSIX. I believe the implementation of
brk in linux is not POSIX compliant, unless your statement is not
really true.

malloc is supposed to be ANSI; but if it cannot be used to allocate memory
then it isn't ANSI either.

BTW, the kernel support for brk contains:

        if (do_mmap(NULL, oldbrk, newbrk-oldbrk,
                   PROT_READ|PROT_WRITE|PROT_EXEC,
                   MAP_FIXED|MAP_PRIVATE, 0) != oldbrk)

The manpage also implies that memory can only be increased, but the kernel
code says it can be reduced too.

If this doesn't map pages into the process, what does? This certainly
looks like it allocates memory. Now that memory may have to be initialized
(i.e., a demand-zero page fault), but this looks like a real allocation to me.
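
From user space, the grow/touch/shrink sequence can be exercised with
something like the sketch below. It assumes a system that still provides
sbrk() (Linux with glibc does); grow_touch_shrink() is a hypothetical
name. It is meant only for a single-threaded test program that does not
call malloc() between the two sbrk() calls, since the allocator may move
the break as well.

```c
#define _DEFAULT_SOURCE                 /* for sbrk() on glibc */
#include <stdint.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: extend the data segment by len bytes, write to
 * all of it, then move the break back. Returns 0 if the break ends up
 * where it started, -1 on any failure. */
int grow_touch_shrink(size_t len)
{
    if (len == 0 || len > (size_t)INTPTR_MAX)
        return -1;

    void *old_brk = sbrk(0);            /* current end of data segment */
    if (old_brk == (void *)-1)
        return -1;

    void *base = sbrk((intptr_t)len);   /* returns the previous break */
    if (base == (void *)-1)
        return -1;                      /* errno is ENOMEM here */

    /* First touch: each page takes a demand-zero fault. */
    memset(base, 0xA5, len);

    /* The break can be lowered again, as the kernel code allows. */
    if (sbrk(-(intptr_t)len) == (void *)-1)
        return -1;

    return sbrk(0) == old_brk ? 0 : -1;
}
```

If the memset() is where the process gets killed under memory pressure,
then the successful return from sbrk() promised something the kernel
could not deliver.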

>> > For stack growth, maybe we need some way to tell the kernel:
>> > "never page-out my stack, and reserve me this space...".
>> ---
>> Paging out is not the issue. The issue is not having enough
>> combined memory and swap space. OOM doesn't simply mean out of physical
>> memory -- it means out of swap space as well. For this discussion most
>> people are using "memory" to mean "memory+swap".
>
>I know. But I think that mlock()ing stack pages could be easy to implement.
>And it gives you a way to write "secure" programs. In a "secure" program
>you should control stack growth anyway.

I wouldn't lock the stack - most of the entries on the stack are not going
to be used. Resident memory is more important than that. Page it out if
necessary.

>And, reading previous postings, now I know you can manage your own
>stack. This is even easier. Just set your stack up, mlock() *a few* pages,
>and write them to disk when you need more space. The only "active" part
>of a stack is the top, so it's very easy to manage a file image of it.
>
>> > Applications should be able to bypass kernel management of their address
>> > space. But this should be done on a per-app basis.
>> ---
>> I agree with this statement, but it isn't relevant to the discussion
>> topic.
>
>Here I don't follow you. Per-application mm management is much better
>than playing with a system-wide setting (such as disabling overcommit).

Because the topic is the kernel, kernel resource management, and the kernel
interaction with processes - specifically the memory allocation to processes.

Per application mm management is userspace. Unless the kernel can supply
the resources, application management is useless - there are no resources
to manage...
-------------------------------------------------------------------------
Jesse I Pollard, II
Email: pollard@cats-chateau.net

Any opinions expressed are solely my own.
