OOM problems on 2.6.12-rc1 with many fsx tests

From: Mingming Cao
Date: Wed Mar 23 2005 - 14:58:10 EST


Andrea, Andrew,

I run into OOM problem again on 2.6.12-rc1. I run some(20) fsx tests on
2.6.12-rc1 kernel(and 2.6.11-mm4) on ext3 filesystem, after about 10
hours the system hit OOM, and OOM keep killing processes one by one. I
could reproduce this problem very constantly on a 2 way PIII 700MHZ with
512MB RAM. Also the problem could be reproduced on running the same test
on reiser fs.

The fsx command is:

./fsx -c 10 -n -r 4096 -w 4096 /mnt/test/foo1 &

I also see fsx tests start to generating report about read bad data
about the tests have run for about 9 hours(one hour before of the OOM
happen).

I also logged the /proc/meminfo every 30 seconds from a remote machine.
I will attach the OOM message, the /proc/meminfo at the beginning of
the test and the end(when OOM killed the logging process) here.

When I run the tests I don't know this is a known issue and don't know
if 2.6.12-rc1 contains the proposed fix from Andrea/Andrew or not. If
you have something that I could try, let me know.

Thanks!

Mingming


============================================================
OOM messages on console
============================================================

Mar 23 02:16:16 elm3b92 kernel: oom-killer: gfp_mask=0x80d2
Mar 23 02:16:18 elm3b92 kernel: DMA per-cpu:
Mar 23 02:16:18 elm3b92 kernel: cpu 0 hot: low 2, high 6, batch 1
Mar 23 02:16:18 elm3b92 kernel: cpu 0 cold: low 0, high 2, batch 1
Mar 23 02:16:18 elm3b92 kernel: cpu 1 hot: low 2, high 6, batch 1
Mar 23 02:16:18 elm3b92 kernel: cpu 1 cold: low 0, high 2, batch 1
Mar 23 02:16:18 elm3b92 kernel: Normal per-cpu:
Mar 23 02:16:18 elm3b92 kernel: cpu 0 hot: low 32, high 96, batch 16
Mar 23 02:16:18 elm3b92 kernel: cpu 0 cold: low 0, high 32, batch 16
Mar 23 02:16:18 elm3b92 kernel: cpu 1 hot: low 32, high 96, batch 16
Mar 23 02:16:18 elm3b92 kernel: cpu 1 cold: low 0, high 32, batch 16
Mar 23 02:16:18 elm3b92 kernel: HighMem per-cpu: empty
Mar 23 02:16:19 elm3b92 kernel:
Mar 23 02:16:20 elm3b92 kernel: Free pages: 5208kB (0kB HighMem)
Mar 23 02:16:20 elm3b92 kernel: Active:60624 inactive:60838 dirty:0
writeback:0 unstable:0 free:1302 slab:3208 mapped:196 pagetables:105
Mar 23 02:16:20 elm3b92 kernel: DMA free:2072kB min:88kB low:108kB
high:132kB active:4776kB inactive:3908kB present:16384kB
pages_scanned:15150 all_unreclaimable? yes
Mar 23 02:16:20 elm3b92 kernel: lowmem_reserve[]: 0 496 496
Mar 23 02:16:20 elm3b92 kernel: Normal free:3136kB min:2804kB low:3504kB
high:4204kB active:237968kB inactive:239200kB present:507904kB
pages_scanned:15151 all_unreclaimable? no
Mar 23 02:16:20 elm3b92 kernel: lowmem_reserve[]: 0 0 0
Mar 23 02:16:20 elm3b92 kernel: HighMem free:0kB min:128kB low:160kB
high:192kB active:0kB inactive:0kB present:0kB pages_scanned:0
all_unreclaimable? no
Mar 23 02:16:20 elm3b92 kernel: lowmem_reserve[]: 0 0 0
Mar 23 02:16:20 elm3b92 kernel: DMA: 0*4kB 1*8kB 1*16kB 0*32kB 0*64kB
0*128kB 0*256kB 0*512kB 0*1024kB 1*2048kB 0*4096kB = 2072kB
Mar 23 02:16:20 elm3b92 kernel: Normal: 58*4kB 25*8kB 1*16kB 2*32kB
0*64kB 1*128kB 0*256kB 1*512kB 0*1024kB 1*2048kB 0*4096kB = 3200kB
Mar 23 02:16:20 elm3b92 kernel: HighMem: empty
Mar 23 02:16:20 elm3b92 kernel: Swap cache: add 284169, delete 283991,
find 105376/203154, race 2+25
Mar 23 02:16:20 elm3b92 kernel: Free swap = 1042692kB
Mar 23 02:16:20 elm3b92 kernel: Total swap = 1052216kB
Mar 23 02:16:20 elm3b92 kernel: Out of Memory: Killed process 2115
(cupsd).
Mar 23 02:21:01 elm3b92 kernel: oom-killer: gfp_mask=0x1d2
Mar 23 02:21:03 elm3b92 kernel: DMA per-cpu:
Mar 23 02:21:03 elm3b92 kernel: cpu 0 hot: low 2, high 6, batch 1
Mar 23 02:21:03 elm3b92 kernel: cpu 0 cold: low 0, high 2, batch 1
Mar 23 02:21:03 elm3b92 kernel: cpu 1 hot: low 2, high 6, batch 1
Mar 23 02:21:03 elm3b92 kernel: cpu 1 cold: low 0, high 2, batch 1
Mar 23 02:21:03 elm3b92 kernel: Normal per-cpu:
Mar 23 02:21:03 elm3b92 kernel: cpu 0 hot: low 32, high 96, batch 16
Mar 23 02:21:03 elm3b92 kernel: cpu 0 cold: low 0, high 32, batch 16
Mar 23 02:21:03 elm3b92 kernel: cpu 1 hot: low 32, high 96, batch 16
Mar 23 02:21:03 elm3b92 kernel: cpu 1 cold: low 0, high 32, batch 16
Mar 23 02:21:03 elm3b92 kernel: HighMem per-cpu: empty
Mar 23 02:21:04 elm3b92 kernel:
Mar 23 02:21:08 elm3b92 kernel: Free pages: 4872kB (0kB HighMem)
Mar 23 02:21:10 elm3b92 kernel: Active:61061 inactive:60495 dirty:0
writeback:30 unstable:0 free:1218 slab:3159 mapped:272 pagetables:86
Mar 23 02:21:11 elm3b92 kernel: DMA free:2072kB min:88kB low:108kB
high:132kB active:4832kB inactive:3852kB present:16384kB
pages_scanned:18465 all_unreclaimable? yes
Mar 23 02:21:11 elm3b92 kernel: lowmem_reserve[]: 0 496 496
Mar 23 02:21:11 elm3b92 kernel: Normal free:2800kB min:2804kB low:3504kB
high:4204kB active:239492kB inactive:237988kB present:507904kB
pages_scanned:107346 all_unreclaimable? no
Mar 23 02:21:11 elm3b92 kernel: lowmem_reserve[]: 0 0 0
Mar 23 02:21:11 elm3b92 kernel: DMA: 0*4kB 1*8kB 1*16kB 0*32kB 0*64kB
0*128kB 0*256kB 0*512kB 0*1024kB 1*2048kB 0*4096kB = 2072kB
Mar 23 02:21:11 elm3b92 kernel: Normal: 2*4kB 7*8kB 3*16kB 2*32kB 0*64kB
1*128kB 0*256kB 1*512kB 0*1024kB 1*2048kB 0*4096kB = 2864kB
Mar 23 02:21:11 elm3b92 kernel: HighMem: empty
Mar 23 02:21:11 elm3b92 kernel: Swap cache: add 292112, delete 291918,
find 107914/208530, race 2+28
Mar 23 02:21:11 elm3b92 kernel: Free swap = 1044992kB
Mar 23 02:21:11 elm3b92 kernel: Total swap = 1052216kB
Mar 23 02:21:11 elm3b92 kernel: Out of Memory: Killed process 5638
(nscd).
Mar 23 02:21:19 elm3b92 kernel: oom-killer: gfp_mask=0x1d2
Mar 23 02:21:21 elm3b92 kernel: DMA per-cpu:
Mar 23 02:21:21 elm3b92 kernel: cpu 0 hot: low 2, high 6, batch 1
Mar 23 02:21:21 elm3b92 kernel: cpu 0 cold: low 0, high 2, batch 1
Mar 23 02:21:21 elm3b92 kernel: cpu 1 hot: low 2, high 6, batch 1
Mar 23 02:21:21 elm3b92 kernel: cpu 1 cold: low 0, high 2, batch 1
Mar 23 02:21:21 elm3b92 kernel: Normal per-cpu:
Mar 23 02:21:21 elm3b92 kernel: cpu 0 hot: low 32, high 96, batch 16
Mar 23 02:21:21 elm3b92 kernel: cpu 0 cold: low 0, high 32, batch 16
Mar 23 02:21:21 elm3b92 kernel: cpu 1 hot: low 32, high 96, batch 16
Mar 23 02:21:21 elm3b92 kernel: cpu 1 cold: low 0, high 32, batch 16
Mar 23 02:21:21 elm3b92 kernel: HighMem per-cpu: empty
Mar 23 02:21:21 elm3b92 kernel:
Mar 23 02:21:21 elm3b92 kernel: Free pages: 5296kB (0kB HighMem)
Mar 23 02:21:21 elm3b92 kernel: Active:60775 inactive:60649 dirty:28
writeback:0 unstable:0 free:1324 slab:3157 mapped:220 pagetables:81
Mar 23 02:21:21 elm3b92 kernel: DMA free:2072kB min:88kB low:108kB
high:132kB active:4864kB inactive:3820kB present:16384kB
pages_scanned:18855 all_unreclaimable? yes
Mar 23 02:21:21 elm3b92 kernel: lowmem_reserve[]: 0 496 496
Mar 23 02:21:21 elm3b92 kernel: Normal free:3224kB min:2804kB low:3504kB
high:4204kB active:238080kB inactive:238776kB present:507904kB
pages_scanned:52439 all_unreclaimable? no
Mar 23 02:21:21 elm3b92 kernel: lowmem_reserve[]: 0 0 0
Mar 23 02:21:21 elm3b92 kernel: HighMem free:0kB min:128kB low:160kB
high:192kB active:0kB inactive:0kB present:0kB pages_scanned:0
all_unreclaimable? no
Mar 23 02:21:21 elm3b92 kernel: lowmem_reserve[]: 0 0 0
Mar 23 02:21:21 elm3b92 kernel: DMA: 0*4kB 1*8kB 1*16kB 0*32kB 0*64kB
0*128kB 0*256kB 0*512kB 0*1024kB 1*2048kB 0*4096kB = 2072kB
Mar 23 02:21:21 elm3b92 kernel: Normal: 96*4kB 19*8kB 0*16kB 2*32kB
0*64kB 1*128kB 0*256kB 1*512kB 0*1024kB 1*2048kB 0*4096kB = 3288kB
Mar 23 02:21:21 elm3b92 kernel: HighMem: empty
Mar 23 02:21:21 elm3b92 kernel: Swap cache: add 293172, delete 292990,
find 108239/209189, race 2+31
Mar 23 02:21:21 elm3b92 kernel: Free swap = 1045004kB
Mar 23 02:21:21 elm3b92 kernel: Total swap = 1052216kB
Mar 23 02:21:21 elm3b92 kernel: Out of Memory: Killed process 2062
(portmap).



=======================================================
/proc/meminfo at the beginning of the fsx tests
=======================================================
+ cat /proc/meminfo
MemTotal: 510400 kB
MemFree: 397284 kB
Buffers: 23788 kB
Cached: 35712 kB
SwapCached: 0 kB
Active: 68592 kB
Inactive: 21312 kB
HighTotal: 0 kB
HighFree: 0 kB
LowTotal: 510400 kB
LowFree: 397284 kB
SwapTotal: 1052216 kB
SwapFree: 1052216 kB
Dirty: 2312 kB
Writeback: 0 kB
Mapped: 23500 kB
Slab: 16412 kB
CommitLimit: 1307416 kB
Committed_AS: 74804 kB
PageTables: 572 kB
VmallocTotal: 516024 kB
VmallocUsed: 3700 kB
VmallocChunk: 512320 kB
+ sleep 30

===========================================
/proc/meminfo at the end of the log file
===========================================
+ cat /proc/meminfo
MemTotal: 510400 kB
MemFree: 5512 kB
Buffers: 500 kB
Cached: 584 kB
SwapCached: 344 kB
Active: 243580 kB
Inactive: 242248 kB
HighTotal: 0 kB
HighFree: 0 kB
LowTotal: 510400 kB
LowFree: 5512 kB
SwapTotal: 1052216 kB
SwapFree: 1045984 kB
Dirty: 0 kB
Writeback: 0 kB
Mapped: 804 kB
Slab: 12584 kB
CommitLimit: 1307416 kB
Committed_AS: 13560 kB
PageTables: 284 kB
VmallocTotal: 516024 kB
VmallocUsed: 3700 kB
VmallocChunk: 512320 kB
+ sleep 30




-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/