Re: Avoiding OOM on overcommit...?

From: Marco Colombo (marco@esi.it)
Date: Tue Mar 28 2000 - 12:25:49 EST


On Tue, 28 Mar 2000, Peter T. Breuer wrote:

> "A month of sundays ago ptb wrote:"
> > "A month of sundays ago Marco Colombo wrote:"
> > > On Tue, 28 Mar 2000, Peter T. Breuer wrote:
> > >
> > > > > Why don't you mlock() a piece of RAM (you're going to know if it's available
> > > > > at mlock() time), so you never page-fault, and use it just a as cache
> > > >
> > > > This is a separate issue. I don't care about (heap) memory allocation. You
> > > > can always ensure you have memory at startup, or die. Malloc is not an
> > > > interesting or appropriate subject for the discussion. It's userlevel.
> > >
> > > Besides that mlock() is not "userlevel", now i understand you're focusing
> > > only on stack. Having a "reserved" swap for your stack it's the same of
> > > having a fixed stack. What happens if you run out of stack?
> >
> > No it's not the same. Firstly, if anyone runs out of stack, they're
> > dead so that is not a relevant topic to bring to the table. You can't
>
> Actually, now I come to think of it, I believe the BSD rules work:
>
> 0) the system starts with total swap+ram counting as "available
> virtual ram"
> 1) a "secure" process must have 8MB of "available virtual ram"
> before it can be started successfully (replace 8MB with rlimit, in
> general).
> 2) every time a secure process starts up it reduces the "available
> virtual ram" by 8MB.
> 3) a process can start up with less than 8MB available, or not reduce
> the available amount by 8MB, but in that case it will be marked
> "volatile", which means that it is a candidate for killing if the
> kernel needs more swap/ram.
>
> This ensures that a secure processes stack has somewhere to be paged out
> to, and that therefore the kernel need never kill it when trying to
> find swap space in which to put processes current stack pages.

That's simply a non overcommitting approach. And also a fixed-size approach.
Of course, it gives you a very predictable behaviour. The price
you pay is that system throughput is 1/10 of an overcommitting system
on the same HW. Is your shell 'secure'? And every 'ls' you perform?

Including RAM as a part of VM is a mistake. Consider a 64MB RAM +
64 MB swap system.

t1) Process A requests 60MB via brk(), the system has 128MB of VM,
    so it allocates 60MB to process A. Available VM is 68MB.
t2) Process B starts, and requests 60MB. The system has 68MB of VM,
    so it allocates 60MB to process B. Available VM is now 8MB.
t3) Process A writes all pages, so they are now in-core and dirty.
t4) Process B writes all pages. The kernel has now to page-out A
    pages to swap. About 60MB of swap gets used.
    B pages are in-core, and dirty.
t5) Process A writes all pages again. The kernel has to page-in A pages,
    so it has to page-out B before. But there are only 4MB of free swap.

OOS. You should check free *swap* space, not free VM.

I don't follow BSD any more, but at 4.3 time I'm pretty sure it allocated
*swap*. And it checked for available swap before allowing process creation
(or grow). In the example above, at t2) the kernel refuses to give 60MB
to process B. To run 2 60MB-sized processes you need >120MB of swap, no
matter how much RAM you have.

>
> Peter
>

.TM.

-- 
      ____/  ____/   /
     /      /       /			Marco Colombo
    ___/  ___  /   /		      Technical Manager
   /          /   /			 ESI s.r.l.
 _____/ _____/  _/		       Colombo@ESI.it

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.rutgers.edu Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Fri Mar 31 2000 - 21:00:22 EST