Re: 2.4.23-pre VM regression?

From: Andrea Arcangeli
Date: Thu Oct 16 2003 - 08:36:40 EST


On Thu, Oct 16, 2003 at 10:29:16AM -0200, Marcelo Tosatti wrote:
>
>
> On Thu, 16 Oct 2003, Andrea Arcangeli wrote:
>
> > On Thu, Oct 16, 2003 at 09:52:30AM -0200, Marcelo Tosatti wrote:
> > >
> > > Andrea,
> > >
> > > Martin first reported problems with "gzip -dc file | less" (280MB file).
> > > less was getting killed. He had no swap... I asked him to add some swap
> > > and it works now. Fine.
> > >
> > > The thing is that with 2.4.22 less was being killed, but with 2.4.23-pre
> > > he gets:
> >
> > note, that's a true oom, less needs to allocate 280MB and it doesn't fit
> > in ram. there's no bug as far as I can tell.
> >
> > a `vmstat 1` could confirm that.
> >
> > > >> And yes, the app was killed:
> > > > >
> > > > > __alloc_pages: 0-order allocation failed (gfp=0x1d2/0)
> > > > > VM: killing process named
> > > > > __alloc_pages: 0-order allocation failed (gfp=0x1d2/0)
> > > > > VM: killing process gpm
> > > > > __alloc_pages: 0-order allocation failed (gfp=0x1d2/0)
> > > > > VM: killing process sendmail
> > > > > __alloc_pages: 0-order allocation failed (gfp=0x1d2/0)
> > > > > VM: killing process less
> >
> > here the vm keeps killing until 'less' - the real offender - is nuked.
> >
> > > So a lot of processes which should not get killed are dying. This is
> > > really bad. I was afraid it could happen and it did.
> > >
> > > What now? Resurrect OOM-killer?
> >
> > the oom killer has the problem I outlined some email ago, with shared
> > memory it gets fooled badly etc.., though in a desktop with all tiny
> > tasks except the memory-hog (`less` in this case) it works well.
>
> Andrea,
>
> There is no memory. Right. Some task has to be killed. But not small
> programs like sendmail/named/etc. What should be killed is "less". That is
> clear, right?

sure. I think I already explained there are downsides in disabling the
oom killer for desktops where the offender task is normally the biggest
one too, but those downsides aren't something I care about given the
cases it gets right w/o it (i.e. huge-shm-SGA/mlock/oomdeadlocks). the
oom killer can do the wrong decision too sometime, and more
systematically as well.

Andrea - If you prefer relying on open source software, check these links:
rsync.kernel.org::pub/scm/linux/kernel/bkcvs/linux-2.[45]/
http://www.cobite.com/cvsps/
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/