Re: Found the commit that causes the OOMs

From: Wu Fengguang
Date: Tue Jun 30 2009 - 22:30:41 EST


On Wed, Jul 01, 2009 at 12:50:42AM +0900, Minchan Kim wrote:
> On Tue, Jun 30, 2009 at 11:05 PM, Wu Fengguang<fengguang.wu@xxxxxxxxx> wrote:
> >
> > More data: I boot 2.6.30-rc1 with mem=1G and enabled 1GB swap and run msgctl11.
> >
> > It goes OOM at the 2nd run. They are very interesting numbers: memory leaked?
>
> Hmm. That's very serious, and it's a separate problem, since this system
> has a swap device and it's not full.

Yes.

> Can you reproduce it easily ?

Not always. The first run after a fresh boot completes fine.
On the second run it may OOM or lock up (dmesg in another email).

> I want to reproduce it in my system.
>
> Did you run only msgctl11, not the whole LTP test suite?
> With just the default parameters? e.g. $ ./testcases/bin/msgctl11

Yes, I run it standalone with no parameters.

> 2nd run? You mean you execute msgctl11 two times in a row?
> That is, the first test finishes successfully, and the OOM happens
> during the second test, before it finishes?

Yes, I run it twice after a fresh boot,
because the first run seems to always succeed.
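
The two-run procedure described above could be scripted roughly as follows. This is only a sketch: the helper name run_repeatedly is made up for illustration, and the path is the one from the example command quoted earlier. It runs a command back to back and collects the exit codes. Since the OOM killer may pick a child rather than the parent, the exit code alone is not conclusive, and dmesg still has to be checked after each run.

```python
import subprocess


def run_repeatedly(cmd, runs=2):
    """Run `cmd` `runs` times back to back and return the exit codes.

    Hypothetical helper for illustration. On POSIX a negative code
    means the process was terminated by that signal number.
    """
    codes = []
    for _ in range(runs):
        result = subprocess.run(cmd)
        codes.append(result.returncode)
    return codes


# Intended use on a freshly booted test machine (not executed here):
#   run_repeatedly(["./testcases/bin/msgctl11"], runs=2)
```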

Thanks,
Fengguang

>
> >     [ 2259.825958] msgctl11 invoked oom-killer: gfp_mask=0x84d0, order=0, oom_adj=0
> >     [ 2259.828092] Pid: 29657, comm: msgctl11 Not tainted 2.6.31-rc1 #22
> >     [ 2259.830505] Call Trace:
> >     [ 2259.832010]  [<ffffffff8156f366>] ? _spin_unlock+0x26/0x30
> >     [ 2259.834219]  [<ffffffff810c8b26>] oom_kill_process+0x176/0x270
> >     [ 2259.837603]  [<ffffffff810c8def>] ? badness+0x18f/0x300
> >     [ 2259.839906]  [<ffffffff810c9095>] __out_of_memory+0x135/0x170
> >     [ 2259.842035]  [<ffffffff810c91c5>] out_of_memory+0xf5/0x180
> >     [ 2259.844270]  [<ffffffff810cd86c>] __alloc_pages_nodemask+0x6ac/0x6c0
> >     [ 2259.846743]  [<ffffffff810f8fa8>] alloc_pages_current+0x78/0x100
> >     [ 2259.849083]  [<ffffffff81033515>] pte_alloc_one+0x15/0x50
> >     [ 2259.851282]  [<ffffffff810e0eda>] __pte_alloc+0x2a/0xf0
> >     [ 2259.853454]  [<ffffffff810e16e2>] handle_mm_fault+0x742/0x830
> >     [ 2259.855793]  [<ffffffff815725cb>] do_page_fault+0x1cb/0x330
> >     [ 2259.858033]  [<ffffffff8156fdf5>] page_fault+0x25/0x30
> >     [ 2259.860301] Mem-Info:
> >     [ 2259.861706] Node 0 DMA per-cpu:
> >     [ 2259.862523] CPU    0: hi:    0, btch:   1 usd:   0
> >     [ 2259.864454] CPU    1: hi:    0, btch:   1 usd:   0
> >     [ 2259.866608] Node 0 DMA32 per-cpu:
> >     [ 2259.867404] CPU    0: hi:  186, btch:  31 usd: 197
> >     [ 2259.869283] CPU    1: hi:  186, btch:  31 usd: 175
> >     [ 2259.870511] Active_anon:0 active_file:11 inactive_anon:0
> >
> > zero anon pages!
> >
> >     [ 2259.870512]  inactive_file:0 unevictable:0 dirty:0 writeback:0 unstable:0
> >     [ 2259.870513]  free:1986 slab:42170 mapped:96 pagetables:59427 bounce:0
> >     [ 2259.877722] Node 0 DMA free:3976kB min:56kB low:68kB high:84kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB present:15164kB pages_scanned:429 all_unreclaimable? no
> >     [ 2259.883804] lowmem_reserve[]: 0 982 982 982
> >     [ 2259.885814] Node 0 DMA32 free:3968kB min:3980kB low:4972kB high:5968kB active_anon:0kB inactive_anon:0kB active_file:44kB inactive_file:0kB unevictable:0kB present:1005984kB pages_scanned:152 all_unreclaimable? no
> >     [ 2259.890958] lowmem_reserve[]: 0 0 0 0
> >     [ 2259.893183] Node 0 DMA: 4*4kB 3*8kB 2*16kB 0*32kB 1*64kB 2*128kB 0*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 3976kB
> >     [ 2259.897406] Node 0 DMA32: 334*4kB 77*8kB 24*16kB 27*32kB 10*64kB 1*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 3968kB
> >     [ 2259.902753] 625 total pagecache pages
> >     [ 2259.903623] 454 pages in swap cache
> >     [ 2259.905299] Swap cache stats: add 95129, delete 94675, find 55783/67607
> >     [ 2259.908858] Free swap  = 1041232kB
> >     [ 2259.909618] Total swap = 1048568kB
> >
> > swap far from full!
> >
> >     [ 2259.919456] 262144 pages RAM
> >     [ 2259.921071] 12513 pages reserved
> >     [ 2259.922790] 314212 pages shared
> >     [ 2259.923548] 165757 pages non-shared
> >     [ 2259.925234] Out of memory: kill process 20791 (msgctl11) score 2280094 or a child
> >     [ 2259.928982] Killed process 21946 (msgctl11)
> >
> >
>
>
>
> --
> Kind regards,
> Minchan Kim
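
To put the "zero anon pages" and "memory leaked?" remarks in numbers, the page counters from the quoted Mem-Info dump can be added up. The sketch below assumes 4 kB pages (consistent with the 4kB buddy-list entries in the dump); the dictionary keys and function names are made up for illustration. It shows only how many pages the listed counters leave unexplained, not where those pages went, since the dump does not list every kind of kernel allocation.

```python
PAGE_KB = 4  # assumption: 4 kB pages, as in the buddy lists of the dump

# Values copied from the quoted OOM report (all in pages).
counters = {
    "total_ram": 262144,
    "reserved": 12513,
    "free": 1986,
    "slab": 42170,
    "pagetables": 59427,
    "active_anon": 0,
    "inactive_anon": 0,
    "active_file": 11,
    "inactive_file": 0,
}


def to_mib(pages):
    """Convert a page count to MiB."""
    return pages * PAGE_KB / 1024


def summarize(c):
    """Return usable, LRU, accounted and unaccounted page counts."""
    usable = c["total_ram"] - c["reserved"]
    lru = (c["active_anon"] + c["inactive_anon"]
           + c["active_file"] + c["inactive_file"])
    known = c["free"] + c["slab"] + c["pagetables"] + lru
    return {
        "usable": usable,
        "lru": lru,
        "known": known,
        "unaccounted": usable - known,
    }


summary = summarize(counters)
for name, pages in summary.items():
    print(f"{name:12s} {pages:7d} pages  {to_mib(pages):7.1f} MiB")
```

With these inputs, page tables alone come to about 232 MiB and slab to about 165 MiB, the LRU lists hold only 11 pages, and roughly 570 MiB of the usable memory is not covered by any counter in the dump.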