Re: [PATCH] oom killer (Core)

From: Andrea Arcangeli
Date: Thu Dec 02 2004 - 11:49:08 EST


On Thu, Dec 02, 2004 at 02:48:00PM +0100, Thomas Gleixner wrote:
> Reentrancy and follow up calls of oom_kill() are blocked until the task
> which was killed by the first oom_kill() has actually released the
> resources. I added a callback which is called from do_exit() when the
> PF_MEMDIE flag is set. The reentrancy blocking is released inside the
> callback.

This logic will certainly fix it, and I'm not against this logic to fix
the problem in a definitive way. It's only unfortunate PF_MEMDIE is smp
racy (plus having to check for PF_MEMDIE in the fast path).

> >From the first call to out_of_memory, which activates the reentrancy
> blocking until the blocking is released in the callback, out_of_memory
> is called more than 300 times.

I believe the thing you're hiding with the callback, is some screwup in
the VM. It shouldn't fire oom 300 times in a row.

zone->all_unreclaimable and zone->pages_scanned _must_ be set to 0 by
the oom_killer invocation, did you try to fix that?

> So it's up to you VM guys to fight out from which place you want call
> out_of_memory(). I don't care as both places have exactly the same bad
> side effects.

the reason for oom in page_alloc.c, is that you must check the free
pages levels correctly, your previous nr_free_pages() attempt was the
unreliable way to do that (it couldn't take into account all the
lowmem_reserve levels etc..).

> My concern is to make it reliable when it is finally invoked.

This approach of using a callback will sure work fine for that but the
300 times invocation of oom kill shows something is wrong in the VM, and
I believe the screwup is zone->all_unreclaimable and zone->pages_scanned
not being resetted by the oom killer invocation. I suspect if you fix
that single bit, things will start working even without the callback.

Note that in 2.4 only one task gets killed, and there's no need of any
callback in any fast path to make it work. I'm not conceptually against
the callback to "guarantee" oom-killing racing avoidance, but right now
it sounds like it's hiding some more fundamental problem in 2.6.

Thanks.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/