Re: Please backport commit 3812c8c8f39 to stable

From: Michal Hocko
Date: Fri Oct 03 2014 - 11:37:57 EST

Next message: Rik van Riel: "Re: [PATCH RFC] sched,idle: teach select_idle_sibling about idle states"
Previous message: Shuah Khan: "[PATCH v3 3/5] selftests/ipc: change test to use ksft framework"
In reply to: Cong Wang: "Re: Please backport commit 3812c8c8f39 to stable"
Next in thread: Cong Wang: "Re: Please backport commit 3812c8c8f39 to stable"

On Thu 02-10-14 14:04:08, Cong Wang wrote:
> Hello again,
>
> I realized it is actually a series of patches:
>
> 3812c8c8f3953921ef18544110dafc3505c1ac62 mm: memcg: do not trap
> chargers with full callstack on OOM
> fb2a6fc56be66c169f8b80e07ed999ba453a2db2 mm: memcg: rework and
> document OOM waiting and wakeup
> 519e52473ebe9db5cdef44670d5a97f1fd53d721 mm: memcg: enable memcg OOM
> killer only for user faults
> 3a13c4d761b4b979ba8767f42345fed3274991b0 x86: finish user fault error
> path with fatal signal
> 759496ba6407c6994d6a5ce3a5e74937d7816208 arch: mm: pass userspace
> fault flag to generic fault handler
> 871341023c771ad233620b7a1fb3d9c7031c4e5c arch: mm: do not invoke OOM
> killer on kernel fault OOM
> 94bce453c78996cc4373d5da6cfabe07fcc6d9f9 arch: mm: remove obsolete
> init OOM protection

Yes, that looks like the full series.

> I am not sure if they have more dependencies.
>
> However, this bug is *fairly* easy to reproduce on 3.10, just using the
> following script:
>
> #!/bin/bash
>
> TEST_DIR=/tmp/cgroup_test
> [ -d $TEST_DIR ] || mkdir -p $TEST_DIR
> mount -t cgroup none $TEST_DIR -o memory
> mkdir $TEST_DIR/test
> echo 512k > $TEST_DIR/test/memory.limit_in_bytes

This is just insane. You allow only 128 pages to be charged, and
reclaim will constantly have to wait for each page to finish
writeback.
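
For reference, the 128 figure is simply the limit divided by the page
size. A minimal sketch of that arithmetic, assuming 4 KiB pages (other
architectures and configurations may differ); limit_to_pages is a
hypothetical helper name used only for illustration:

```shell
#!/bin/bash
# Hypothetical helper: number of whole pages that fit under a memcg limit.
# $1 = limit in bytes
# $2 = page size in bytes (optional; defaults to the running system's value)
limit_to_pages() {
    local limit_bytes=$1
    local page_size=${2:-$(getconf PAGESIZE)}
    echo $(( limit_bytes / page_size ))
}

# "512k" written to memory.limit_in_bytes means 512 * 1024 bytes.
# The 4096 here is the assumed 4 KiB page size.
limit_to_pages $(( 512 * 1024 )) 4096
```

Leaving out the second argument uses whatever page size the running
system reports instead of the assumed 4 KiB.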

> dd if=/dev/zero of=/tmp/oom_test_big_file bs=512 count=20000000 &
> echo $! > $TEST_DIR/test/tasks
> rm -f /tmp/oom_test_big_file
> umount $TEST_DIR
>
>
> Run it like this:
>
> for i in `seq 1 1000`; do ./oom_hung.sh ; done

OK, so you will eventually deplete the limit with anon charges if the
pid makes it into the group before dd allocates its 512B buffer (which
will end up consuming a full page anyway). So the OOM is pretty much
unavoidable. All the tasks will have minimal RSS, so it is then just a
matter of luck which one gets killed. But this alone shouldn't cause a
deadlock. Are you really sure this is the same issue discussed in the
mentioned patch?

> So please consider this seriously. :)

The bug has been there since the memory controller was introduced. Yet
we have had only a single report of it happening in real life. So I do
not think this is that urgent. It was definitely not a good design
decision to handle the OOM killer on top of unknown locks which might
prevent forward progress. No question about that. Do you see the
problem in real life somewhere? To be honest, the test case is pretty
much insane.
--
Michal Hocko
SUSE Labs
