Re: [Patch v4 1/2] freezer: check OOM kill while being frozen

From: Michal Hocko
Date: Mon Sep 15 2014 - 05:36:15 EST


On Mon 15-09-14 05:34:36, Rafael J. Wysocki wrote:
> On Monday, September 15, 2014 09:56:57 AM Tejun Heo wrote:
> > On Sun, Sep 14, 2014 at 06:43:31PM +0200, Rafael J. Wysocki wrote:
> > > On Saturday, September 13, 2014 08:59:35 AM Tejun Heo wrote:
> > > > Doesn't this mean that if PM freezing and OOM killing race each other,
> > > > the system may hang? Driver PM operation may try to allocate memory
> > > > -> triggers OOM -> OOM killer selects an already frozen task ->
> > > > nothing happens. I wonder whether OOM killing and PM operations
> > > > should be mutually exclusive at a higher level. e.g. make OOM killing
> > > > always override freezing but let hibernation abort operation before
> > > > taking snapshot if OOM killing has happened since the beginning of the
> > > > PM operation.
> > >
> > > As Michal noted, we do oom_killer_disable() in freeze_processes(), so the
> > > scenario above cannot actually happen to my eyes. Or am I missing anything?
> >
> > Ah, okay, that's better but it doesn't seem enough. It does prevent
> > new invocations of the oom killer but doesn't do anything if oom
> > killing is already in progress. If we do block out oom killing
> > properly across PM freeze/thaw, it should be fine.
>
> OK, so my assumption was that oom_killer_disable() would wait for the OOM
> killing in progress to complete. Alternatively, it can return an error code
> if OOM killing is in progress and we can simply fail the freezing in that
> case.

You will need to check all the tasks again after oom_killer_disable(),
since an OOM kill may have raced with the freezing. Something like the
following should work. I am not very familiar with PM, so I might have
missed something. I didn't like the open-coded do_each_thread loop, but
there doesn't seem to be a suitable helper, and the other callers are
doing something slightly different in their loops.
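
Separately from the patch itself, the recheck idea can be modelled in
user space. This is only a sketch and not kernel code: every name with
the model_ prefix is made up for illustration, and the model assumes
that a task touched by a racing OOM kill simply shows up as "not frozen"
when the task list is walked again.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a task; NOT the kernel's task_struct. */
struct model_task {
	const char *comm;	/* task name, for readability only */
	bool frozen;		/* true once the task has been frozen */
};

/* Models the "OOM killer disabled" state as a single flag. */
static bool model_oom_disabled;

/* Blocks new OOM invocations only; says nothing about one in flight. */
static void model_oom_killer_disable(void)
{
	model_oom_disabled = true;
}

static void model_oom_killer_enable(void)
{
	model_oom_disabled = false;
}

/* Walk every task once more; returns true only if all are frozen. */
static bool model_all_frozen(const struct model_task *tasks, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (!tasks[i].frozen)
			return false;
	}
	return true;
}

/*
 * Final step of the modelled freeze: disable the OOM killer first, then
 * recheck the tasks. If any task is not frozen, assume an OOM kill raced
 * with the freeze, undo the disable and report failure (-1 here; a real
 * implementation would pick a proper error code).
 */
static int model_freeze_finish(const struct model_task *tasks, size_t n)
{
	model_oom_killer_disable();
	if (!model_all_frozen(tasks, n)) {
		model_oom_killer_enable();
		return -1;
	}
	return 0;
}
```

The ordering is the point of the sketch: the recheck runs only after new
OOM invocations have been blocked, so a task found unfrozen at that stage
makes the freeze fail instead of being silently missed.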

This patch builds on top of Cong Wang's. What do you think?
---