Re: [PATCH] oom: skip frozen tasks

From: David Rientjes
Date: Thu Aug 25 2011 - 17:14:28 EST


On Thu, 25 Aug 2011, Michal Hocko wrote:

> > > > That's obviously false since we call oom_killer_disable() in
> > > > freeze_processes() to disable the oom killer from ever being called in the
> > > > first place, so this is something you need to resolve with Rafael before
> > > > you cause more machines to panic.
> > >
> > > I didn't mean suspend/resume path (that is protected by oom_killer_disabled)
> > > so the patch doesn't make any change.
> >
> > Confused... freeze_processes() does try_to_freeze_tasks() before
> > oom_killer_disable() ?
>
> Yes you are right, I must have been blind.
>
> Now I see the point. We do not want to panic while we are suspending and
> memory is really low just because all of userspace is already in
> the fridge.
> Sorry for the confusion.
>
> I still do not follow the oom_killer_disable note from David, though.
>

oom_killer_disable() was added to that path for a reason: once all threads
are frozen, memory allocations can still occur in the suspend path under an
oom condition, and with the oom killer disabled those allocations fail
rather than triggering pointless SIGKILLs to frozen threads.
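
To make that concrete, here is a rough userspace sketch of the behaviour,
not the actual page allocator code. struct toy_mm and toy_alloc() are
made-up names for illustration only; the one thing it tries to show is
that with the oom killer disabled, an allocation under memory pressure
simply fails and no kill is attempted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model state; names echo the discussion but this is not kernel code. */
struct toy_mm {
	size_t free_pages;        /* pages still available */
	bool oom_killer_disabled; /* set once all tasks are frozen for suspend */
	unsigned int oom_kills;   /* how many times the toy oom killer ran */
};

/*
 * Try to take `pages` pages. Returns true on success. When memory is
 * short, the request fails immediately if the oom killer is disabled;
 * otherwise the toy oom killer runs once and the caller may retry.
 */
bool toy_alloc(struct toy_mm *mm, size_t pages)
{
	assert(mm != NULL);

	if (pages <= mm->free_pages) {
		mm->free_pages -= pages;
		return true;
	}

	if (mm->oom_killer_disabled)
		return false;	/* fail the allocation, send no SIGKILL */

	mm->oom_kills++;	/* stands in for selecting and killing a victim */
	return false;
}
```

In this model a caller in the suspend path just sees the failure and has
to handle it, which is the point of disabling the oom killer there.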

Now consider the case where the only _eligible_ threads for oom kill
(because of cpusets or mempolicies) are those that are frozen. We certainly
do not want to panic, because other cpusets are still getting work done.
We'd want to either add a mem to the cpuset or thaw the processes, because
it is that cpuset which is oom.

You can't just selectively skip certain threads when their state can be
temporary without risking a panic. That's why this patch is a
non-starter.
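
As a rough illustration of the failure mode (a userspace sketch, not the
patch under review and not the kernel's selection code), suppose victim
selection skips frozen tasks outright. struct toy_task and
select_skip_frozen() are hypothetical names. When every task eligible for
this allocation happens to be frozen, the selection comes back empty, and
that is the situation in which the real oom killer would have no victim.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a task; not the kernel's task_struct. */
struct toy_task {
	const char *name;
	bool eligible;       /* allowed by the cpuset/mempolicy of the allocation */
	bool frozen;         /* models PF_FROZEN, which may be a temporary state */
	unsigned int points; /* toy badness score, higher means better victim */
};

/*
 * Pick the eligible task with the highest score, skipping frozen tasks
 * entirely. Returns NULL when no candidate remains.
 */
const struct toy_task *select_skip_frozen(const struct toy_task *tasks, size_t n)
{
	const struct toy_task *victim = NULL;

	assert(tasks != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		const struct toy_task *t = &tasks[i];

		if (!t->eligible || t->frozen)
			continue;	/* frozen tasks are never considered */
		if (!victim || t->points > victim->points)
			victim = t;
	}
	return victim;
}
```

With two eligible but frozen tasks and one running task in another
cpuset, this returns NULL even though the rest of the machine is fine.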

A much better solution would be to lower the badness score that the oom
killer uses for PF_FROZEN threads so that they aren't considered a
priority for kill unless there's nothing else left to kill.
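
Something along these lines, as a userspace sketch only. How far to lower
the score is left open above; pinning frozen tasks to the lowest non-zero
score (TOY_FROZEN_SCORE) is my assumption here, and struct toy_task,
toy_badness() and select_deprioritize_frozen() are made-up names, not
existing kernel interfaces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a task; not the kernel's task_struct. */
struct toy_task {
	const char *name;
	bool eligible;       /* allowed by the cpuset/mempolicy of the allocation */
	bool frozen;         /* models PF_FROZEN */
	unsigned int points; /* toy badness before any adjustment */
};

/* Assumed floor: lowest score that still leaves a task killable. */
#define TOY_FROZEN_SCORE 1u

/* Returns 0 for "never kill", otherwise a score where higher is worse. */
unsigned int toy_badness(const struct toy_task *t)
{
	if (!t->eligible)
		return 0;
	if (t->frozen)
		return TOY_FROZEN_SCORE;	/* last resort, but still a candidate */
	/* Keep every non-frozen candidate above the frozen floor. */
	return t->points > TOY_FROZEN_SCORE ? t->points : TOY_FROZEN_SCORE + 1;
}

/* Pick the task with the highest non-zero score, or NULL if there is none. */
const struct toy_task *select_deprioritize_frozen(const struct toy_task *tasks,
						  size_t n)
{
	const struct toy_task *victim = NULL;
	unsigned int best = 0;

	assert(tasks != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		unsigned int score = toy_badness(&tasks[i]);

		if (score > best) {
			best = score;
			victim = &tasks[i];
		}
	}
	return victim;
}
```

A frozen task is then chosen only when no non-frozen eligible task
exists, so the selection does not come back empty just because the
eligible tasks are in the fridge.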