[PATCH -v3 0/5] OOM vs PM freezer fixes

From: Michal Hocko
Date: Fri Jan 09 2015 - 06:06:16 EST


Hi,
this is an updated version of the patchset previously posted here:
http://marc.info/?l=linux-mm&m=141779771518056&w=2.
Changes since then are:
- cleanups, doc and function renames as per Tejun
- __thaw_task moved into mark_tsk_oom_victim and the frozen() check removed
as it would be racy and it is not necessary anyway - per Tejun (see the
sketch below)
- fixed an obvious typo in the wait_event condition
- oom_killer_enable moved to thaw_processes, before user tasks are thawed,
rather than to thaw_kernel_threads, which is not even called from the s2ram
resume path - per Tejun
- oom_killer_disable moved to freeze_processes to be symmetric with the
enable side.
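
For reference, the marking side after these changes looks roughly like this
(a simplified sketch, not a verbatim excerpt from the patches):

void mark_tsk_oom_victim(struct task_struct *tsk)
{
	if (test_and_set_tsk_thread_flag(tsk, TIF_MEMDIE))
		return;

	/*
	 * Wake the task up even if it sits frozen in the fridge - the OOM
	 * killer cannot make progress while its victim sleeps there. No
	 * frozen() check is needed: __thaw_task checks frozen(p) itself
	 * under freezer_lock, so a check here would only be racy (hence
	 * its removal mentioned above).
	 */
	__thaw_task(tsk);
}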

I have tested the series in a KVM guest with 100M of RAM:
- many small tasks (20M anon mmap) which keep triggering the OOM killer
continually
- s2ram, which resumes automatically under pm_test, triggered in a loop:
	echo processors > /sys/power/pm_test
	while true
	do
		echo mem > /sys/power/state
		sleep 1s
	done
- a simple kernel module which allocates and frees 20M in 8K chunks. If it
sees freezing(current) it tries another round of allocations before calling
try_to_freeze (a rough reconstruction is sketched after this list)
- debugging messages for the PM stages and for OOM killer
enable/disable/failure were added, and unmark_oom_victim is delayed by 1s
after it clears TIF_MEMDIE and before it wakes up waiters.
- rebased on top of the current mmotm, which means some necessary updates
in mm/oom_kill.c. mark_tsk_oom_victim is now called under task_lock, but
I think this should be OK because __thaw_task shouldn't interfere with any
locking down the wake_up_process path. Oleg?
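
The test module itself was not posted; what follows is a rough, untested
reconstruction of what it does. Everything here is an assumption except the
20M/8K sizes, the freezing(current)/try_to_freeze behaviour described above
and the kmem_eater name visible in the log further down.

#include <linux/module.h>
#include <linux/slab.h>
#include <linux/kthread.h>
#include <linux/freezer.h>
#include <linux/delay.h>

#define CHUNK_SIZE	(8 << 10)			/* 8K chunks */
#define NR_CHUNKS	((20 << 20) / CHUNK_SIZE)	/* 20M in total */

static void *chunks[NR_CHUNKS];
static struct task_struct *eater;

static void eat_and_free(void)
{
	int i;

	/* kmalloc may fail once the OOM killer is disabled - that is the point */
	for (i = 0; i < NR_CHUNKS; i++)
		chunks[i] = kmalloc(CHUNK_SIZE, GFP_KERNEL);
	for (i = 0; i < NR_CHUNKS; i++) {
		kfree(chunks[i]);
		chunks[i] = NULL;
	}
}

static int kmem_eater(void *data)
{
	set_freezable();
	while (!kthread_should_stop()) {
		eat_and_free();
		/*
		 * If the freezer is asking us to freeze, do one more
		 * allocation round first so we can see whether allocations
		 * simply fail (rather than OOM killing a frozen task) once
		 * userspace is frozen and the OOM killer is disabled.
		 */
		if (freezing(current)) {
			eat_and_free();
			try_to_freeze();
		}
		msleep(100);
	}
	return 0;
}

static int __init kmem_eater_init(void)
{
	eater = kthread_run(kmem_eater, NULL, "kmem_eater");
	return IS_ERR(eater) ? PTR_ERR(eater) : 0;
}

static void __exit kmem_eater_exit(void)
{
	kthread_stop(eater);
}

module_init(kmem_eater_init);
module_exit(kmem_eater_exit);
MODULE_LICENSE("GPL");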

As expected there are no OOM killed tasks after the OOM killer is disabled,
and allocations requested by the kernel thread fail once all user tasks are
frozen and the OOM killer is disabled. I wasn't able to catch a case where
oom_killer_disable would actually have to wait for an OOM victim, but I
expected that race to be quite unlikely anyway.
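
The wait in question is for already selected victims to release their
memory. A simplified sketch of the handshake used by patch 5 (locking and
the current-is-already-a-victim handling are elided; the names follow the
descriptions above but the bodies here are illustrative):

static atomic_t oom_victims = ATOMIC_INIT(0);
static DECLARE_WAIT_QUEUE_HEAD(oom_victims_wait);
static bool oom_killer_disabled;

void unmark_oom_victim(struct task_struct *tsk)
{
	if (!test_and_clear_tsk_thread_flag(tsk, TIF_MEMDIE))
		return;

	/* Last victim gone - let a pending oom_killer_disable() proceed. */
	if (atomic_dec_return(&oom_victims) == 0)
		wake_up_all(&oom_victims_wait);
}

bool oom_killer_disable(void)
{
	/*
	 * Forbid selecting new victims. The allocator's OOM path is
	 * expected to check this flag and fail the allocation instead of
	 * killing anything - hence the "page allocation failure" messages
	 * in the log below.
	 */
	oom_killer_disabled = true;

	/* ... and wait until existing victims have released their memory. */
	wait_event(oom_victims_wait, !atomic_read(&oom_victims));

	return true;
}

An excerpt from the test log: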

[ 242.609330] Killed process 2992 (mem_eater) total-vm:24412kB, anon-rss:2164kB, file-rss:4kB
[ 243.628071] Unmarking 2992 OOM victim. oom_victims: 1
[ 243.636072] (elapsed 2.837 seconds) done.
[ 243.641985] Trying to disable OOM killer
[ 243.643032] Waiting for concurent OOM victims
[ 243.644342] OOM killer disabled
[ 243.645447] Freezing remaining freezable tasks ... (elapsed 0.005 seconds) done.
[ 243.652983] Suspending console(s) (use no_console_suspend to debug)
[ 243.903299] kmem_eater: page allocation failure: order:1, mode:0x204010
[...]
[ 243.992600] PM: suspend of devices complete after 336.667 msecs
[ 243.993264] PM: late suspend of devices complete after 0.660 msecs
[ 243.994713] PM: noirq suspend of devices complete after 1.446 msecs
[ 243.994717] ACPI: Preparing to enter system sleep state S3
[ 243.994795] PM: Saving platform NVS memory
[ 243.994796] Disabling non-boot CPUs ...

The first two patches are simple OOM cleanups. They should go in
regardless of the rest IMO.
Patches 3 and 4 are trivial printk -> pr_* conversions and should go in
as well (a representative hunk is sketched below).
The main patch is the last one and I would appreciate acks from Tejun
and Rafael. I think the OOM part should be OK (except for __thaw_task
vs. task_lock, where a look from Oleg would be appreciated) but I am not
so sure I haven't screwed anything up in the freezer code. I have found
several surprises there.
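
A typical hunk in patches 3 and 4 is purely mechanical, along these lines
(illustrative only, not copied from the patches):

	/* before */
	printk(KERN_INFO "Restarting tasks ... ");

	/* after */
	pr_info("Restarting tasks ... ");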

The patchset is based on the current mmotm tree (mmotm-2015-01-07-17-07).
I think it makes more sense if it is routed via Andrew due to dependencies
on other OOM killer patches.

Shortlog says:
Michal Hocko (5):
oom: add helpers for setting and clearing TIF_MEMDIE
oom: thaw the OOM victim if it is frozen
PM: convert printk to pr_* equivalent
sysrq: convert printk to pr_* equivalent
oom, PM: make OOM detection in the freezer path raceless

And diffstat:
drivers/staging/android/lowmemorykiller.c |   7 +-
drivers/tty/sysrq.c                       |  23 ++---
include/linux/oom.h                       |  18 ++--
kernel/exit.c                             |   3 +-
kernel/power/process.c                    |  76 +++++----------
mm/memcontrol.c                           |   4 +-
mm/oom_kill.c                             | 149 ++++++++++++++++++++++++++----
mm/page_alloc.c                           |  17 +---
8 files changed, 185 insertions(+), 112 deletions(-)
