Re: PROBLEM: oom killer and swap weirdness on 2.6.3* kernels

From: Hugh Dickins
Date: Fri May 21 2010 - 17:19:18 EST


On Thu, 20 May 2010, dave b wrote:

> Is there a reason no one has taken any interest in my email?
> The behaviour isn't found on the 2.6.26 debian kernel. So I was
> thinking that it might be due to my intel graphics card / memory
> interplay ? ....

It's nothing personal: the usual reason is that people are very busy.

>
> On 14 May 2010 23:14, dave b <db.pub.mail@xxxxxxxxx> wrote:
> > On 14 May 2010 22:53, dave b <db.pub.mail@xxxxxxxxx> wrote:
> >> In 2.6.3* kernels (test case was performed on the 2.6.33.3 kernel)
> >> when physical memory runs out and there is a large swap partition -
> >> the system completely stalls.
> >>
> >> I noticed that when running debian lenny using dm-crypt with
> >> encrypted / and swap with a 2.6.33.3 kernel (and all of the 2.6.3*
> >> series iirc) when all physical memory is used (swappiness was left at
> >> the default 60) the system hangs and does not respond. It can resume
> >> normal operation some time later - however it seems to take a *very*
> >> long time for the oom killer to come in. Obviously with swapoff this
> >> doesn't happen - the oom killer comes in and does its job.
> >>
> >>
> >> free -m
> >>              total       used       free     shared    buffers     cached
> >> Mem:          1980       1101        879          0         58        201
> >> -/+ buffers/cache:        840       1139
> >> Swap:        24943          0      24943
> >>
> >>
> >> My simple test case is
> >>
> >> dd if=/dev/zero of=/tmp/stall
> >> and wait till /tmp fills...

Is that tmpfs sized the default 50% of RAM?
If you have sized it larger, then indeed filling it up might behave badly.

> >>
> >
> > Sorry - I forgot to say I am running x86-64

But I wonder if you're suffering from a bug which KOSAKI-San just
identified, and for which he has very recently posted this patch:
please try it and let us all know - thanks.

Hugh

[PATCH] tmpfs: Insert tmpfs cache pages to inactive list at first

Shaohua Li reported that parallel file copies on tmpfs can trigger the
OOM killer. This is a regression caused by commit 9ff473b9a7
(vmscan: evict streaming IO first). Wow, it is a two-year-old patch!

Currently, tmpfs file cache pages are inserted into the active list first.
This means the insertion not only increases the number of pages on the anon
LRU, but also reduces the anon scanning ratio. As a result, vmscan gets
totally confused: it scans almost only the file LRU even though the system
has plenty of unused tmpfs pages.

Historically, lru_cache_add_active_anon() was used for two reasons:
1) To prioritize shmem pages over regular file cache.
2) To avoid reclaim priority inversion for used-once pages.

But we have lost both motivations. (1) We now have separate anon and
file LRU lists, so inserting into the active list no longer provides
that prioritization. (2) In the past, a single pte access bit caused page
activation, so inserting into the inactive list with the pte access bit
set meant higher priority than inserting into the active list. That
priority inversion could lead to unintended LRU churn, but it was already
solved by commit 645747462 (vmscan: detect mapped file pages used only once).
(Thanks Hannes, you are great!)

Thus, now we can use lru_cache_add_anon() instead.

Reported-by: Shaohua Li <shaohua.li@xxxxxxxxx>
Cc: Wu Fengguang <fengguang.wu@xxxxxxxxx>
Cc: Johannes Weiner <hannes@xxxxxxxxxxx>
Cc: Rik van Riel <riel@xxxxxxxxxx>
Cc: Minchan Kim <minchan.kim@xxxxxxxxx>
Cc: Hugh Dickins <hughd@xxxxxxxxxx>
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@xxxxxxxxxxxxxx>
---
mm/filemap.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index b941996..023ef61 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -452,7 +452,7 @@ int add_to_page_cache_lru(struct page *page, struct address_space *mapping,
 		if (page_is_file_cache(page))
 			lru_cache_add_file(page);
 		else
-			lru_cache_add_active_anon(page);
+			lru_cache_add_anon(page);
 	}
 	return ret;
 }
--
1.6.5.2