On Sat, Mar 05, 2011 at 06:34:37PM +0300, Andrew Vagin wrote:
balance_pgdat() sets zone->all_unreclaimable, but the problem is that it is cleared late.

On 03/05/2011 06:20 PM, Minchan Kim wrote:
zone_reclaimable checks it. Isn't it enough?
Sorry if I confused you.

On Sat, Mar 05, 2011 at 02:44:16PM +0300, Andrey Vagin wrote:
I sent one more patch: [PATCH] mm: skip zombie in OOM-killer.

Check zone->all_unreclaimable in all_unreclaimable(), otherwise the
kernel may hang, because shrink_zones() will do nothing while
all_unreclaimable() will report that the zone has reclaimable pages.

do_try_to_free_pages()
	shrink_zones()
		for_each_zone
			if (zone->all_unreclaimable)
				continue
	if !all_unreclaimable(zonelist, sc)
		return 1

__alloc_pages_slowpath()
retry:
	did_some_progress = do_try_to_free_pages(page)
	...
	if (!page && did_some_progress)
		retry;

Signed-off-by: Andrey Vagin <avagin@xxxxxxxxxx>
---
mm/vmscan.c | 2 ++
1 files changed, 2 insertions(+), 0 deletions(-)
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 6771ea7..1c056f7 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2002,6 +2002,8 @@ static bool all_unreclaimable(struct zonelist *zonelist,
 	for_each_zone_zonelist_nodemask(zone, z, zonelist,
 			gfp_zone(sc->gfp_mask), sc->nodemask) {
+		if (zone->all_unreclaimable)
+			continue;
 		if (!populated_zone(zone))
 			continue;
 		if (!cpuset_zone_allowed_hardwall(zone, GFP_KERNEL))
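
The call chain in the patch description can be modelled with a toy simulation. The sketch below is not kernel code: the `Zone` class, its two fields, the `check_flag` switch and the retry cap are all hypothetical, introduced only to show why the allocator would keep retrying when shrink_zones() skips a zone that the all_unreclaimable() helper still counts as reclaimable.

```python
from dataclasses import dataclass


@dataclass
class Zone:
    # Hypothetical stand-in for struct zone; only what the argument needs.
    all_unreclaimable: bool   # the flag that balance_pgdat() sets
    looks_reclaimable: bool   # what a zone_reclaimable()-style check reports


def shrink_zones(zones):
    """Return the number of pages freed; flagged zones are skipped."""
    freed = 0
    for zone in zones:
        if zone.all_unreclaimable:
            continue
        freed += 1  # pretend one page was reclaimed from this zone
    return freed


def all_unreclaimable(zones, check_flag):
    """Return True if no zone is considered reclaimable."""
    for zone in zones:
        if check_flag and zone.all_unreclaimable:
            continue  # models the two lines added by the patch
        if zone.looks_reclaimable:
            return False
    return True


def do_try_to_free_pages(zones, check_flag):
    freed = shrink_zones(zones)
    if freed:
        return freed
    if not all_unreclaimable(zones, check_flag):
        return 1  # reports progress although nothing was freed
    return 0


def alloc_slowpath(zones, check_flag, max_retries=1000):
    """Count retries; max_retries only exists so the model terminates."""
    retries = 0
    while retries < max_retries:
        did_some_progress = do_try_to_free_pages(zones, check_flag)
        page = None  # in this model the allocation never succeeds
        if page is None and did_some_progress:
            retries += 1
            continue
        break
    return retries
```

With a single zone that is flagged all_unreclaimable but still looks reclaimable, `alloc_slowpath(zones, check_flag=False)` runs into the retry cap, while `check_flag=True` gives up immediately and lets the caller move on (in the kernel, towards the OOM killer).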
These two patches are enough.
I mean zone->all_unreclaimable becomes true if !zone_reclaimable in balance_pgdat.
zone_reclaimable compares the recent pages_scanned with the number of zone LRU pages.
So scanning too many pages in a zone with few LRU pages makes the zone unreclaimable.
In all_unreclaimable, we call zone_reclaimable to detect it.
It's the same thing as your patch.
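
As an illustration of the comparison described above, here is a minimal sketch of a zone_reclaimable()-style check. The function takes plain integers rather than a zone structure, and the factor of 6 is an assumption based on mm/vmscan.c of that period; it should be checked against the tree in question.

```python
def zone_reclaimable(pages_scanned, reclaimable_pages, factor=6):
    """Return True while the zone has been scanned fewer than
    `factor` times its number of reclaimable (LRU) pages.

    `factor` defaults to 6, which is an assumption, not a quoted value.
    """
    return pages_scanned < reclaimable_pages * factor


# A zone with only 100 LRU pages is declared unreclaimable after 600 scans,
# which is why heavy scanning of a small LRU trips the check quickly.
small_zone_ok = zone_reclaimable(pages_scanned=599, reclaimable_pages=100)
small_zone_stuck = zone_reclaimable(pages_scanned=600, reclaimable_pages=100)
```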
Yes.
It seems the test program acts as a fork bomb and hogs memory.
Did the hang really happen, or did you find it by code review?
Yes. You can reproduce it with the help of the attached Python program. It's
not very clever :)
It performs the following actions in a loop:
1. fork
2. mmap
3. touch memory
4. read memory
5. munmap
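
Steps 2 to 5 above can be tried in isolation with the Python 3 sketch below. It deliberately leaves out the fork step so that it is safe to run, and it is not the attached program: the function name `touch_and_check` and the `PAGE_SIZE` constant are hypothetical, and a 4096-byte page is assumed. It relies on `mmap.MAP_ANONYMOUS`, so it is Unix-only.

```python
import mmap

PAGE_SIZE = 4096  # assumed page size


def touch_and_check(num_pages):
    """Map num_pages anonymous pages (num_pages >= 1), write one byte
    into each page, read every byte back, then unmap.

    Returns True if all bytes read back as written.
    """
    buf = mmap.mmap(-1, num_pages * PAGE_SIZE,
                    mmap.MAP_ANONYMOUS | mmap.MAP_PRIVATE)
    try:
        # Touch memory: the first write to each page forces it to be allocated.
        for j in range(num_pages):
            buf[j * PAGE_SIZE] = j % 255
        # Read memory: verify what was written.
        for j in range(num_pages):
            if buf[j * PAGE_SIZE] != j % 255:
                return False
        return True
    finally:
        buf.close()  # munmap
```

Calling `touch_and_check(16)` maps 64 KiB, dirties every page and verifies it; reproducing the reported hang would additionally need many such processes running concurrently under memory pressure.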
If you apply this patch, is the problem gone?
import sys, time, mmap, os
from subprocess import Popen, PIPE
import random

# Python 2 script: stresses memory by forking and touching anonymous mappings.
global mem_size

def info(msg):
    # Log a message to stderr, prefixed with the current PID.
    pid = os.getpid()
    print >> sys.stderr, "%s: %s" % (pid, msg)
    sys.stderr.flush()

def memory_loop(cmd = "a"):
    """
    cmd may be:
        c: check memory
        else: touch memory
    """
    c = 0
    for j in xrange(0, mem_size):
        # One byte per 4 KiB page (j << 12 is the page offset).
        if cmd == "c":
            if f[j << 12] != chr(j % 255):
                info("Data corruption")
                sys.exit(1)
        else:
            f[j << 12] = chr(j % 255)

while True:
    pid = os.fork()
    if (pid != 0):
        # Parent: map a random number of pages, touch them, verify them, unmap.
        mem_size = random.randint(0, 56 * 4096)
        f = mmap.mmap(-1, mem_size << 12, mmap.MAP_ANONYMOUS|mmap.MAP_PRIVATE)
        memory_loop()
        memory_loop("c")
        f.close()