Re: Memory management in Kernel 2.4.x

From: Andrea Arcangeli (andrea@suse.de)
Date: Sat Jun 01 2002 - 17:59:50 EST


On Sat, Jun 01, 2002 at 02:35:20PM +0200, Andreas Hartmann wrote:
> Hello Andrea,
>
> Andrea Arcangeli wrote:
> >On Mon, May 27, 2002 at 03:45:55PM +0200, Peter Wächtler wrote:
> >
> >>Andreas Hartmann wrote:
> >>
> >>>Zwane Mwaikambo wrote:
> >>>
> >>>
> >>>
> >>>>On Mon, 27 May 2002, Andreas Hartmann wrote:
> >>>>
> >>>>
> >>>>
> >>>>>rsync allocates all of the memory the machine has (256 MB RAM, 128 MB
> >>>>>swap). When this occures, processes get killed like described in the
> >>>>>posting before. The machine doesn't respond as long as the rsync -
> >>>>>process isn't killed, because it fetches all the memory which gets free
> >>>>>after a process has been killed.
> >>>>>
> >>>>
> >>>>And the rsync process never gets singled out? nice!
> >>>>
> >>>
> >>>Until it's killed by the kernel (if overcommitment isn't deactivated).
> >>>If overcommitment is deactivated, the services of the machine are dead
> >>>forever. There will be nothing, which kills such a process. Or am I
> >>>wrong?
> >>>
> >>
> >>There is still the oom killer (Out Of Memory).
> >>But it doesn't trigger and the machine pages "forever".
> >>Usually kswapd eats the CPU then, discarding and reloading pages,
> >>searching lists for pages to evict and so on.
> >
> >
> >can you reproduce with 2.4.19pre9aa2? I expect at least the deadlock
> >(if it's a deadlock and not a livelock) should go away.
> >
> >Also I read in another email that somebody grown the per-socket buffer
> >to hundred mbytes, if you do that on a 32bit arch with highmem you'll
> >run into troubles, too much ZONE_NORMAL ram will be constantly pinned
> >for the tcp pipeline and the machine can enter livelocks.
>
> I tested it and the error-situation occured again (thanks to rsync :-)).
>
> How did the kernel react?
> Well, I waited some seconds until all the memory was in use (256 MB RAM
> and 128 MB swap). I have to say, nearly all the memory, because there
> was allways some MB's free in RAM. So I was able to login via ssh as
> root. As I wanted to do something, the machine didn't react. Until this
> point, the machine seemed to work fine - xosview through ssh showed the
> activity very well. But then it crashed, because it couldn't open
> /proc/stat:
> Can not open file : /proc/stat
> Some seconds later, the machine was working normal again. What happened?
> /var/log/messages says:
>
> Jun 1 14:03:46 susi squid[271]: Squid Parent: child process 273 exited
> due to signal 9
> Jun 1 14:03:48 susi kernel: __alloc_pages: 0-order allocation failed
> (gfp=0x1d2/0)
> Jun 1 14:03:48 susi kernel: __alloc_pages: 0-order allocation failed
> (gfp=0x1d2/0)
> Jun 1 14:03:48 susi kernel: VM: killing process squid
> Jun 1 14:04:05 susi kernel: __alloc_pages: 0-order allocation failed
> (gfp=0x1f0/0)
> Jun 1 14:04:07 susi kernel: __alloc_pages: 0-order allocation failed
> (gfp=0xf0/0)
> Jun 1 14:04:11 susi kernel: __alloc_pages: 0-order allocation failed
> (gfp=0x1f0/0)
> Jun 1 14:04:25 susi kernel: __alloc_pages: 0-order allocation failed
> (gfp=0x1d2/0)
> Jun 1 14:04:25 susi kernel: VM: killing process rsync
>
>
> The kernel killed the maliciuos rsync-process.
> Not nice was, that the kernel killed squid, too. If it didn't kill squid
> it would have been a very good result. On the other hand, squid had

squid or any other task could as well have more memory allocated than
rsync, so the oom killer heuristic could get wrong it too. There's
simply no way that the kernel will be able to do the right choice always
without user intervention, no-way. Adding the allocation rate of tasks
would be a good start, much better than the current oom killer heuristic
that will fail on any real non-desktop server usage with 90% of virtual
memory always allocated in non malicious tasks.

> already problems - so it was probably a good idea to kill it completely
> to make room for other running programs.
>
> I would like to know, if the killing of rsync was pure chance or if it
> was systematically. I will try it again :-).

it was systemaltically. If will eventually be killed as you expect. The
only problem is that something else may be killed too in the meantime,
but the kernel can always do the wrong choice anyways, and being
guaranteed of no-deadlock is much better than the current status of
mainline.

we had some discussion with Andrew on l-k last month on how to be safe
and to still allow an heuristic to make a first-choice based on some
statistic, but that's low prio for me. no matter what we do the kernel
can always do the wrong thing and have to kill task without the
userspace being aware of the error condition.

Andrea
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Fri Jun 07 2002 - 22:00:11 EST