Re: [PATCH 3/4] OOM, PM: OOM killed task shouldn't escape PM suspend

From: Tejun Heo
Date: Wed Nov 05 2014 - 08:02:58 EST


Hello, Michal.

On Wed, Nov 05, 2014 at 01:46:20PM +0100, Michal Hocko wrote:
> As I've said I wasn't entirely happy with this half solution but it helped
> the current situation at the time. The full solution would require to

I don't think this helps the situation. It just makes the bug more
obscure and the race window while reduced is still pretty big and
there seems to be an actual not too low chance of the bug triggering
out in the wild. How does this level of obscuring help anything? In
addition to making the bug more difficult to reproduce, it also adds a
bunch of code which *pretends* to address the issue but ultimately
just lowers visibility into what's going on and hinders tracking down
the issue when something actually goes wrong. This is *NOT* making
the situation better. The patch is net negative.

> I think the patch below should be safe. Would you prefer this solution
> instead? It is race free but there is the risk that exposing a lock which

Yes, this is an a lot saner approach in general.

> completely blocks OOM killer from the allocation path will kick us
> later.

Can you please spell it out? How would it kick us? We already have
oom_killer_disable/enable(), how is this any different in terms of
correctness from them? Also, why isn't this part of
oom_killer_disable/enable()? The way they're implemented is really
silly now. It just sets a flag and returns whether there's a
currently running instance or not. How were these even useful? Why
can't you just make disable/enable to what they were supposed to do
from the beginning?

Thanks.

--
tejun
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/