Re: 3.4-rc4 oom killer out of control.

From: Dave Jones
Date: Thu Apr 26 2012 - 16:53:22 EST


I rebooted, and reran the test, and within minutes, got it into a state
where it was killing things again fairly quickly.
This time however, it seems to have killed almost everything on the box,
but is still alive. The problem is that all the memory is eaten up by
something, and kswapd/ksmd is eating all the cpu.
(Attempting to profile with perf causes perf to be oom-killed).


# free
total used free shared buffers cached
Mem: 8149440 8046560 102880 0 272 3316
-/+ buffers/cache: 8042972 106468
Swap: 1023996 1023996 0

Attempting to flush the buffers with drop_caches made no difference.



A lot of VMAs in slab..

Active / Total Objects (% used) : 467327 / 494733 (94.5%)
Active / Total Slabs (% used) : 18195 / 18195 (100.0%)
Active / Total Caches (% used) : 145 / 207 (70.0%)
Active / Total Size (% used) : 241177.72K / 263399.54K (91.6%)
Minimum / Average / Maximum Object : 0.33K / 0.53K / 9.16K

OBJS ACTIVE USE OBJ SIZE SLABS OBJ/SLAB CACHE SIZE NAME
213216 213167 99% 0.49K 6663 32 106608K vm_area_struct
74718 74674 99% 0.37K 3558 21 28464K anon_vma_chain
37820 37663 99% 0.52K 1220 31 19520K anon_vma
33263 33188 99% 0.51K 1073 31 17168K kmalloc-192
25344 25340 99% 0.48K 768 33 12288K sysfs_dir_cache
12740 12660 99% 0.38K 637 20 5096K btrfs_free_space_cache
18981 9213 48% 0.58K 703 27 11248K dentry
8112 8075 99% 0.33K 338 24 2704K kmalloc-8
9982 6064 60% 1.34K 434 23 13888K inode_cache
5768 5605 97% 0.57K 206 28 3296K kmalloc-256
5640 5201 92% 0.38K 282 20 2256K kmalloc-64
4600 4577 99% 0.34K 200 23 1600K kmalloc-16
6182 4002 64% 0.36K 281 22 2248K debug_objects_cache
3002 2980 99% 1.62K 158 19 5056K mm_struct



ps axf shows there's hardly anything left running after all the oom killing:
http://fpaste.org/Dd9p/raw/

sysrq-t traces..

ksmd D ffff88021b81c940 5760 46 2 0x00000000
ffff88021b847b20 0000000000000046 0000000000000000 0000000000000000
ffff88021b81c940 ffff88021b847fd8 ffff88021b847fd8 ffff88021b847fd8
ffff8802232f24a0 ffff88021b81c940 ffff88021b847b50 ffff88021b847d20
Call Trace:
[<ffffffff816ace99>] schedule+0x29/0x70
[<ffffffff816aad75>] schedule_timeout+0x385/0x4f0
[<ffffffff816ac23f>] ? wait_for_common+0x3f/0x160
[<ffffffff816ae6da>] ? _raw_spin_unlock_irqrestore+0x4a/0x90
[<ffffffff810a591d>] ? try_to_wake_up+0x26d/0x310
[<ffffffff810a2981>] ? get_parent_ip+0x11/0x50
[<ffffffff816ac313>] wait_for_common+0x113/0x160
[<ffffffff810a59c0>] ? try_to_wake_up+0x310/0x310
[<ffffffff81166c30>] ? __pagevec_release+0x40/0x40
[<ffffffff816ac43d>] wait_for_completion+0x1d/0x20
[<ffffffff81086b76>] flush_work+0x46/0x70
[<ffffffff81085740>] ? do_work_for_cpu+0x30/0x30
[<ffffffff8108a37b>] schedule_on_each_cpu+0xeb/0x130
[<ffffffff81166c55>] lru_add_drain_all+0x15/0x20
[<ffffffff811a5e45>] ksm_scan_thread+0x8c5/0xd90
[<ffffffff81090040>] ? __init_waitqueue_head+0x60/0x60
[<ffffffff811a5580>] ? run_store+0x2b0/0x2b0
[<ffffffff8108f397>] kthread+0xb7/0xc0
[<ffffffff816b2b2d>] ? sub_preempt_count+0x9d/0xd0
[<ffffffff816b7f34>] kernel_thread_helper+0x4/0x10
[<ffffffff8109c08c>] ? finish_task_switch+0x7c/0x120
[<ffffffff816af034>] ? retint_restore_args+0x13/0x13
[<ffffffff8108f2e0>] ? __init_kthread_worker+0x70/0x70
[<ffffffff816b7f30>] ? gs_change+0x13/0x13

kswapd0 R running task 4152 45 2 0x00000008
ffff88021b833c80 0000000000000000 ffff88021b833b60 ffffffff8116d6df
ffff880200000000 ffff880200000001 0000000000000000 0000000000000002
0000000200000001 ffff88022ffebc40 ffff880226035598 000000000000001e
Call Trace:
[<ffffffff8116d6df>] ? shrink_inactive_list+0x17f/0x590
[<ffffffff8116e2d8>] ? shrink_mem_cgroup_zone+0x448/0x5d0
[<ffffffff8116e4d6>] ? shrink_zone+0x76/0xa0
[<ffffffff8116fd35>] ? balance_pgdat+0x555/0x7a0
[<ffffffff8117011c>] ? kswapd+0x19c/0x5f0
[<ffffffff81090040>] ? __init_waitqueue_head+0x60/0x60
[<ffffffff8116ff80>] ? balance_pgdat+0x7a0/0x7a0
[<ffffffff8108f397>] ? kthread+0xb7/0xc0
[<ffffffff816b2b2d>] ? sub_preempt_count+0x9d/0xd0
[<ffffffff816b7f34>] ? kernel_thread_helper+0x4/0x10
[<ffffffff8109c08c>] ? finish_task_switch+0x7c/0x120
[<ffffffff816af034>] ? retint_restore_args+0x13/0x13
[<ffffffff8108f2e0>] ? __init_kthread_worker+0x70/0x70
[<ffffffff816b7f30>] ? gs_change+0x13/0x13


khugepaged S 000000010039c99c 5728 47 2 0x00000000
ffff88021ba1bc60 0000000000000046 ffff88021ba1bc98 ffffffff81c28be0
ffff88021ba48000 ffff88021ba1bfd8 ffff88021ba1bfd8 ffff88021ba1bfd8
ffff8802232f24a0 ffff88021ba48000 ffff88021ba1bc50 ffff88021ba1bc98
Call Trace:
[<ffffffff816ace99>] schedule+0x29/0x70
[<ffffffff816aaba9>] schedule_timeout+0x1b9/0x4f0
[<ffffffff81076370>] ? lock_timer_base+0x70/0x70
[<ffffffff811b2233>] khugepaged+0x273/0x14a0
[<ffffffff8109c08c>] ? finish_task_switch+0x7c/0x120
[<ffffffff81090040>] ? __init_waitqueue_head+0x60/0x60
[<ffffffff811b1fc0>] ? khugepaged_alloc_sleep+0x160/0x160
[<ffffffff8108f397>] kthread+0xb7/0xc0
[<ffffffff816b2b2d>] ? sub_preempt_count+0x9d/0xd0
[<ffffffff816b7f34>] kernel_thread_helper+0x4/0x10
[<ffffffff8109c08c>] ? finish_task_switch+0x7c/0x120
[<ffffffff816af034>] ? retint_restore_args+0x13/0x13
[<ffffffff8108f2e0>] ? __init_kthread_worker+0x70/0x70
[<ffffffff816b7f30>] ? gs_change+0x13/0x13



sysrq-M ...

Mem-Info:
Node 0 DMA per-cpu:
CPU 0: hi: 0, btch: 1 usd: 0
CPU 1: hi: 0, btch: 1 usd: 0
CPU 2: hi: 0, btch: 1 usd: 0
CPU 3: hi: 0, btch: 1 usd: 0
Node 0 DMA32 per-cpu:
CPU 0: hi: 186, btch: 31 usd: 27
CPU 1: hi: 186, btch: 31 usd: 0
CPU 2: hi: 186, btch: 31 usd: 24
CPU 3: hi: 186, btch: 31 usd: 1
Node 0 Normal per-cpu:
CPU 0: hi: 186, btch: 31 usd: 67
CPU 1: hi: 186, btch: 31 usd: 3
CPU 2: hi: 186, btch: 31 usd: 55
CPU 3: hi: 186, btch: 31 usd: 13
active_anon:1565586 inactive_anon:283198 isolated_anon:0
active_file:241 inactive_file:505 isolated_file:0
unevictable:1414 dirty:14 writeback:0 unstable:0
free:25817 slab_reclaimable:10704 slab_unreclaimable:56662
mapped:262 shmem:45 pagetables:45795 bounce:0
Node 0 DMA free:15876kB min:128kB low:160kB high:192kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15652kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:32kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
lowmem_reserve[]: 0 3246 8034 8034
Node 0 DMA32 free:47320kB min:27252kB low:34064kB high:40876kB active_anon:2651136kB inactive_anon:530560kB active_file:244kB inactive_file:232kB unevictable:788kB isolated(anon):0kB isolated(file):0kB present:3324200kB mlocked:788kB dirty:0kB writeback:0kB mapped:268kB shmem:68kB slab_reclaimable:1200kB slab_unreclaimable:24680kB kernel_stack:336kB pagetables:29924kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:2 all_unreclaimable? no
lowmem_reserve[]: 0 0 4788 4788
Node 0 Normal free:40072kB min:40196kB low:50244kB high:60292kB active_anon:3611208kB inactive_anon:602232kB active_file:720kB inactive_file:1404kB unevictable:4868kB isolated(anon):0kB isolated(file):0kB present:4902912kB mlocked:4868kB dirty:56kB writeback:0kB mapped:780kB shmem:112kB slab_reclaimable:41616kB slab_unreclaimable:201936kB kernel_stack:2960kB pagetables:153256kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:123 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
Node 0 DMA: 1*4kB 0*8kB 0*16kB 0*32kB 2*64kB 1*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 3*4096kB = 15876kB
Node 0 DMA32: 84*4kB 76*8kB 27*16kB 16*32kB 4*64kB 5*128kB 5*256kB 2*512kB 5*1024kB 2*2048kB 8*4096kB = 47072kB
Node 0 Normal: 64*4kB 171*8kB 279*16kB 384*32kB 135*64kB 54*128kB 18*256kB 3*512kB 0*1024kB 0*2048kB 0*4096kB = 40072kB
1774 total pagecache pages
1040 pages in swap cache
Swap cache stats: add 1966902, delete 1965862, find 483713/484292
Free swap = 0kB
Total swap = 1023996kB
2097136 pages RAM
59776 pages reserved
996409 pages shared
2007653 pages non-shared

sysrq-l traces show ksmd is doing *lots* of scanning..

Process ksmd (pid: 46, threadinfo ffff88021b846000, task ffff88021b81c940)
Stack:
ffff88021b847de0 ffffffff811a4881 ffff8800136a5260 ffffea0006236f00
ffff88021b847e90 ffffffff811a5f7d 0000000000000000 ffff88018ef64780
ffff88021b81c940 ffff88021b847fd8 ffff88021b847e50 ffff88018ef64828
Call Trace:
[<ffffffff811a4881>] memcmp_pages+0x61/0xc0
[<ffffffff811a5f7d>] ksm_scan_thread+0x9fd/0xd90
[<ffffffff81090040>] ? __init_waitqueue_head+0x60/0x60
[<ffffffff811a5580>] ? run_store+0x2b0/0x2b0
[<ffffffff8108f397>] kthread+0xb7/0xc0
[<ffffffff816b2b2d>] ? sub_preempt_count+0x9d/0xd0
[<ffffffff816b7f34>] kernel_thread_helper+0x4/0x10
[<ffffffff8109c08c>] ? finish_task_switch+0x7c/0x120
[<ffffffff816af034>] ? retint_restore_args+0x13/0x13
[<ffffffff8108f2e0>] ? __init_kthread_worker+0x70/0x70
[<ffffffff816b7f30>] ? gs_change+0x13/0x13


The machine is still alive, though somewhat crippled due to the oom-killer kicking in a lot,
but I can keep it up for additional debug info gathering..

Dave
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/