Re: [PATCH 3/3] add isolate pages vmstat

From: Minchan Kim
Date: Thu Jul 16 2009 - 00:22:48 EST


On Thu, Jul 16, 2009 at 12:16 PM, Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
> On Thu, 16 Jul 2009 09:55:47 +0900 (JST) KOSAKI Motohiro <kosaki.motohiro@xxxxxxxxxxxxxx> wrote:
>
>> ChangeLog
>>   Since v5
>>    - Rewrote the description
>>    - Treat page migration
>>   Since v4
>>    - Changed displaying order in show_free_areas() (as Wu suggested)
>>   Since v3
>>    - Fixed page misaccounting bug when lumpy reclaim occurs
>>   Since v2
>>    - Separated IsolateLRU field to Isolated(anon) and Isolated(file)
>>   Since v1
>>    - Renamed IsolatePages to IsolatedLRU
>>
>> ==================================
>> Subject: [PATCH] add isolate pages vmstat
>>
>> If the system is running a heavy load of processes then concurrent reclaim
>> can isolate a large number of pages from the LRU. /proc/meminfo and the
>> output generated for an OOM do not show how many pages were isolated.
>>
>> This patch shows the information about isolated pages.
>>
>>
>> How to reproduce
>> -----------------------
>> % ./hackbench 140 process 1000
>>    => OOM occurs
>>
>> active_anon:146 inactive_anon:0 isolated_anon:49245
>>  active_file:79 inactive_file:18 isolated_file:113
>>  unevictable:0 dirty:0 writeback:0 unstable:0 buffer:39
>>  free:370 slab_reclaimable:309 slab_unreclaimable:5492
>>  mapped:53 shmem:15 pagetables:28140 bounce:0
>>
>>
>> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@xxxxxxxxxxxxxx>
>> Acked-by: Rik van Riel <riel@xxxxxxxxxx>
>> Acked-by: Wu Fengguang <fengguang.wu@xxxxxxxxx>
>> Reviewed-by: Minchan Kim <minchan.kim@xxxxxxxxx>
>> ---
>>  drivers/base/node.c    |    4 ++++
>>  fs/proc/meminfo.c      |    4 ++++
>>  include/linux/mmzone.h |    2 ++
>>  mm/migrate.c           |   11 +++++++++++
>>  mm/page_alloc.c        |   12 +++++++++---
>>  mm/vmscan.c            |   12 +++++++++++-
>>  mm/vmstat.c            |    2 ++
>>  7 files changed, 43 insertions(+), 4 deletions(-)
>>
>> Index: b/fs/proc/meminfo.c
>> ===================================================================
>> --- a/fs/proc/meminfo.c
>> +++ b/fs/proc/meminfo.c
>> @@ -65,6 +65,8 @@ static int meminfo_proc_show(struct seq_
>>               "Active(file):   %8lu kB\n"
>>               "Inactive(file): %8lu kB\n"
>>               "Unevictable:    %8lu kB\n"
>> +             "Isolated(anon): %8lu kB\n"
>> +             "Isolated(file): %8lu kB\n"
>>               "Mlocked:        %8lu kB\n"
>
> Are these counters really important enough to justify being present in
> /proc/meminfo?  They seem fairly low-level developer-only details.
> Perhaps relegate them to /proc/vmstat?
>
>>  #ifdef CONFIG_HIGHMEM
>>               "HighTotal:      %8lu kB\n"
>> @@ -110,6 +112,8 @@ static int meminfo_proc_show(struct seq_
>>               K(pages[LRU_ACTIVE_FILE]),
>>               K(pages[LRU_INACTIVE_FILE]),
>>               K(pages[LRU_UNEVICTABLE]),
>> +             K(global_page_state(NR_ISOLATED_ANON)),
>> +             K(global_page_state(NR_ISOLATED_FILE)),
>>               K(global_page_state(NR_MLOCK)),
>>  #ifdef CONFIG_HIGHMEM
>>               K(i.totalhigh),
>> Index: b/include/linux/mmzone.h
>> ===================================================================
>> --- a/include/linux/mmzone.h
>> +++ b/include/linux/mmzone.h
>> @@ -100,6 +100,8 @@ enum zone_stat_item {
>>       NR_BOUNCE,
>>       NR_VMSCAN_WRITE,
>>       NR_WRITEBACK_TEMP,      /* Writeback using temporary buffers */
>> +     NR_ISOLATED_ANON,       /* Temporary isolated pages from anon lru */
>> +     NR_ISOLATED_FILE,       /* Temporary isolated pages from file lru */
>>       NR_SHMEM,               /* shmem pages (included tmpfs/GEM pages) */
>>  #ifdef CONFIG_NUMA
>>       NUMA_HIT,               /* allocated in intended node */
>> Index: b/mm/page_alloc.c
>> ===================================================================
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -2115,16 +2115,18 @@ void show_free_areas(void)
>>               }
>>       }
>>
>> -     printk("Active_anon:%lu active_file:%lu inactive_anon:%lu\n"
>> -             " inactive_file:%lu"
>> +     printk("active_anon:%lu inactive_anon:%lu isolated_anon:%lu\n"
>> +             " active_file:%lu inactive_file:%lu isolated_file:%lu\n"
>>               " unevictable:%lu"
>>               " dirty:%lu writeback:%lu unstable:%lu buffer:%lu\n"
>>               " free:%lu slab_reclaimable:%lu slab_unreclaimable:%lu\n"
>>               " mapped:%lu shmem:%lu pagetables:%lu bounce:%lu\n",
>>               global_page_state(NR_ACTIVE_ANON),
>> -             global_page_state(NR_ACTIVE_FILE),
>>               global_page_state(NR_INACTIVE_ANON),
>> +             global_page_state(NR_ISOLATED_ANON),
>> +             global_page_state(NR_ACTIVE_FILE),
>>               global_page_state(NR_INACTIVE_FILE),
>> +             global_page_state(NR_ISOLATED_FILE),
>>               global_page_state(NR_UNEVICTABLE),
>>               global_page_state(NR_FILE_DIRTY),
>>               global_page_state(NR_WRITEBACK),
>> @@ -2152,6 +2154,8 @@ void show_free_areas(void)
>>                       " active_file:%lukB"
>>                       " inactive_file:%lukB"
>>                       " unevictable:%lukB"
>> +                     " isolated(anon):%lukB"
>> +                     " isolated(file):%lukB"
>>                       " present:%lukB"
>>                       " mlocked:%lukB"
>>                       " dirty:%lukB"
>> @@ -2178,6 +2182,8 @@ void show_free_areas(void)
>>                       K(zone_page_state(zone, NR_ACTIVE_FILE)),
>>                       K(zone_page_state(zone, NR_INACTIVE_FILE)),
>>                       K(zone_page_state(zone, NR_UNEVICTABLE)),
>> +                     K(zone_page_state(zone, NR_ISOLATED_ANON)),
>> +                     K(zone_page_state(zone, NR_ISOLATED_FILE)),
>>                       K(zone->present_pages),
>>                       K(zone_page_state(zone, NR_MLOCK)),
>>                       K(zone_page_state(zone, NR_FILE_DIRTY)),
>> Index: b/mm/vmscan.c
>> ===================================================================
>> --- a/mm/vmscan.c
>> +++ b/mm/vmscan.c
>> @@ -1067,6 +1067,8 @@ static unsigned long shrink_inactive_lis
>>               unsigned long nr_active;
>>               unsigned int count[NR_LRU_LISTS] = { 0, };
>>               int mode = lumpy_reclaim ? ISOLATE_BOTH : ISOLATE_INACTIVE;
>> +             unsigned long nr_anon;
>> +             unsigned long nr_file;
>>
>>               nr_taken = sc->isolate_pages(sc->swap_cluster_max,
>>                             &page_list, &nr_scan, sc->order, mode,
>> @@ -1097,6 +1099,10 @@ static unsigned long shrink_inactive_lis
>>               __mod_zone_page_state(zone, NR_INACTIVE_ANON,
>>                                               -count[LRU_INACTIVE_ANON]);
>>
>> +             nr_anon = count[LRU_ACTIVE_ANON] + count[LRU_INACTIVE_ANON];
>> +             nr_file = count[LRU_ACTIVE_FILE] + count[LRU_INACTIVE_FILE];
>> +             __mod_zone_page_state(zone, NR_ISOLATED_ANON, nr_anon);
>> +             __mod_zone_page_state(zone, NR_ISOLATED_FILE, nr_file);
>>
>>               reclaim_stat->recent_scanned[0] += count[LRU_INACTIVE_ANON];
>>               reclaim_stat->recent_scanned[0] += count[LRU_ACTIVE_ANON];
>> @@ -1164,6 +1170,9 @@ static unsigned long shrink_inactive_lis
>>                               spin_lock_irq(&zone->lru_lock);
>>                       }
>>               }
>> +             __mod_zone_page_state(zone, NR_ISOLATED_ANON, -nr_anon);
>> +             __mod_zone_page_state(zone, NR_ISOLATED_FILE, -nr_file);
>> +
>>       } while (nr_scanned < max_scan);
>
> This is a non-trivial amount of extra stuff.  Do we really need it?
>

I thought so.
This patch resulted from a process fork bomb (e.g., msgctl11 in LTP).
The "too many isolated" patches are based on this isolation counter.

So I think we need this for now.
If we can solve the problem with a different method, then we can drop this.
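
For reference, the accounting that the vmscan.c hunk adds can be modelled
in userspace roughly as below. This is only a sketch, not kernel code:
struct zone_model, struct isolated_batch, isolate_account() and
putback_account() are made-up names for illustration, and the enum only
mirrors the LRU list names used in the patch.

```c
#include <assert.h>

/* Userspace stand-ins for the LRU list indices used in the patch. */
enum lru_model {
	LRU_INACTIVE_ANON,
	LRU_ACTIVE_ANON,
	LRU_INACTIVE_FILE,
	LRU_ACTIVE_FILE,
	LRU_UNEVICTABLE,
	NR_LRU_LISTS
};

/* Hypothetical per-zone counters, standing in for NR_ISOLATED_ANON/FILE. */
struct zone_model {
	long nr_isolated_anon;
	long nr_isolated_file;
};

/* How many pages one reclaim pass took off the anon and file LRUs. */
struct isolated_batch {
	unsigned long nr_anon;
	unsigned long nr_file;
};

/* After isolation: add the batch to the isolated counters. */
static struct isolated_batch isolate_account(struct zone_model *zone,
				const unsigned int count[NR_LRU_LISTS])
{
	struct isolated_batch b;

	b.nr_anon = count[LRU_ACTIVE_ANON] + count[LRU_INACTIVE_ANON];
	b.nr_file = count[LRU_ACTIVE_FILE] + count[LRU_INACTIVE_FILE];
	zone->nr_isolated_anon += (long)b.nr_anon;
	zone->nr_isolated_file += (long)b.nr_file;
	return b;
}

/* After reclaim or putback: subtract the same batch again. */
static void putback_account(struct zone_model *zone, struct isolated_batch b)
{
	zone->nr_isolated_anon -= (long)b.nr_anon;
	zone->nr_isolated_file -= (long)b.nr_file;
	/* Every decrement must pair with an earlier increment. */
	assert(zone->nr_isolated_anon >= 0);
	assert(zone->nr_isolated_file >= 0);
}
```

The point of the pairing is that the counters are non-zero only while a
batch is off the LRU, so a large value in the OOM output means many pages
are held by concurrent reclaimers at that moment.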

--
Kind regards,
Minchan Kim