Re: Silent hang up caused by pages being not scanned?

From: Tetsuo Handa
Date: Wed Oct 14 2015 - 10:38:13 EST


Michal Hocko wrote:
> The OOM report is really interesting:
>
> > [ 69.039152] Node 0 DMA32 free:74224kB min:44652kB low:55812kB high:66976kB active_anon:1334792kB inactive_anon:8240kB active_file:48364kB inactive_file:230752kB unevictable:0kB isolated(anon):92kB isolated(file):0kB present:2080640kB managed:1774264kB mlocked:0kB dirty:9328kB writeback:199060kB mapped:38140kB shmem:8472kB slab_reclaimable:17840kB slab_unreclaimable:16292kB kernel_stack:3840kB pagetables:7864kB unstable:0kB bounce:0kB free_pcp:784kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
>
> so your whole file LRUs are either dirty or under writeback and
> reclaimable pages are below min wmark. This alone is quite suspicious.
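
Indeed, the numbers in that report bear this out:

  file LRU          = active_file + inactive_file = 48364kB + 230752kB = 279116kB
  dirty + writeback =                                 9328kB + 199060kB = 208388kB

i.e. roughly 75% of the file LRU cannot be reclaimed until writeback completes.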

I ran

$ cat < /dev/zero > /tmp/log

for 10 seconds before starting

$ ./a.out

As a result, a large amount of memory was waiting for writeback on the XFS filesystem.
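
For completeness, a.out is essentially an anonymous-memory hog. A minimal
sketch of such a program (illustrative only, not necessarily the exact
reproducer) would be:

/*
 * Illustrative memory hog: grab as much anonymous memory as realloc()
 * will give us, then touch it all so the kernel must actually
 * allocate the pages, forcing reclaim and eventually the OOM killer.
 */
#include <stdlib.h>
#include <string.h>

int main(void)
{
	char *buf = NULL;
	char *cp;
	unsigned long size;

	for (size = 1048576; size < 512UL * (1UL << 30); size <<= 1) {
		cp = realloc(buf, size);
		if (!cp) {
			size >>= 1;
			break;
		}
		buf = cp;
	}
	if (!buf)
		return 1;
	memset(buf, 1, size); /* fault in every page */
	return 0;
}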

> Why hasn't balance_dirty_pages throttled writers and allowed them to
> make the whole LRU dirty? What is your dirty{_background}_{ratio,bytes}
> configuration on that system.

All values are the defaults of a plain CentOS 7 installation:

# sysctl -a | grep ^vm.
vm.admin_reserve_kbytes = 8192
vm.block_dump = 0
vm.compact_unevictable_allowed = 1
vm.dirty_background_bytes = 0
vm.dirty_background_ratio = 10
vm.dirty_bytes = 0
vm.dirty_expire_centisecs = 3000
vm.dirty_ratio = 30
vm.dirty_writeback_centisecs = 500
vm.dirtytime_expire_seconds = 43200
vm.drop_caches = 0
vm.extfrag_threshold = 500
vm.hugepages_treat_as_movable = 0
vm.hugetlb_shm_group = 0
vm.laptop_mode = 0
vm.legacy_va_layout = 0
vm.lowmem_reserve_ratio = 256 256 32
vm.max_map_count = 65530
vm.memory_failure_early_kill = 0
vm.memory_failure_recovery = 1
vm.min_free_kbytes = 45056
vm.min_slab_ratio = 5
vm.min_unmapped_ratio = 1
vm.mmap_min_addr = 4096
vm.nr_hugepages = 0
vm.nr_hugepages_mempolicy = 0
vm.nr_overcommit_hugepages = 0
vm.nr_pdflush_threads = 0
vm.numa_zonelist_order = default
vm.oom_dump_tasks = 1
vm.oom_kill_allocating_task = 0
vm.overcommit_kbytes = 0
vm.overcommit_memory = 0
vm.overcommit_ratio = 50
vm.page-cluster = 3
vm.panic_on_oom = 0
vm.percpu_pagelist_fraction = 0
vm.stat_interval = 1
vm.swappiness = 30
vm.user_reserve_kbytes = 54808
vm.vfs_cache_pressure = 100
vm.zone_reclaim_mode = 0
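
As a rough sanity check against the managed:1774264kB figure in the report
(ballpark only; the kernel computes these limits against dirtyable memory,
which is not exactly the managed figure):

  dirty_background_ratio: 10% of 1774264kB ~= 177426kB (~173MB)
  dirty_ratio:            30% of 1774264kB ~= 532279kB (~520MB)

The report shows dirty + writeback = 208388kB: above the background
threshold (so the flusher threads were writing back) but well below the
hard limit, so balance_dirty_pages() presumably was not yet blocking
writers.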

>
> Also why throttle_vm_writeout haven't slown the reclaim down?

That question is too difficult for me to answer.
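
For other readers, the function in question looks roughly like this
(paraphrased from mm/page-writeback.c of kernels of this era, not a
verbatim quote; details may differ):

/* Paraphrase of throttle_vm_writeout(); not verbatim. */
void throttle_vm_writeout(gfp_t gfp_mask)
{
	unsigned long background_thresh;
	unsigned long dirty_thresh;

	for ( ; ; ) {
		global_dirty_limits(&background_thresh, &dirty_thresh);
		/* Allow page allocators to dirty ~10% more than writers. */
		dirty_thresh += dirty_thresh / 10;

		if (global_page_state(NR_UNSTABLE_NFS) +
		    global_page_state(NR_WRITEBACK) <= dirty_thresh)
			break;
		congestion_wait(BLK_RW_ASYNC, HZ/10);

		/*
		 * The caller might hold locks which prevent IO completion
		 * or filesystem progress, so do not wait here forever.
		 */
		if ((gfp_mask & (__GFP_FS|__GFP_IO)) != (__GFP_FS|__GFP_IO))
			break;
	}
}

Perhaps the answer lies in when this loop is entered at all, but I leave
that to those who know the writeback code better.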

>
> Anyway this is exactly the case where zone_reclaimable helps us to
> prevent OOM because we are looping over the remaining LRU pages without
> making progress... This just shows how subtle all this is :/
>
> I have to think about this much more..
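
For reference, the zone_reclaimable() check being discussed is roughly
the following (paraphrased from mm/vmscan.c of this era):

/*
 * Paraphrased from mm/vmscan.c: a zone counts as reclaimable until six
 * times its reclaimable pages have been scanned without progress.
 */
static bool zone_reclaimable(struct zone *zone)
{
	return zone_page_state(zone, NR_PAGES_SCANNED) <
		zone_reclaimable_pages(zone) * 6;
}

Since the report above shows pages_scanned:0, this check keeps returning
true, so the allocator keeps retrying reclaim instead of declaring OOM;
that is the "silent hang up" in the subject line.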

I'm wary of tweaking the current reclaim logic.
Could you please respond to Linus's comments?

There are more moles than kernel developers can find. I think that what
we can do in the short term is to prepare for the moles that kernel
developers could not find, and in the long term to reform the page
allocator so that such moles cannot survive in the first place.