Re: [PATCH 03/27] mm, vmstat: Add infrastructure for per-node vmstats
From: Mel Gorman
Date: Wed Feb 24 2016 - 04:19:23 EST
On Tue, Feb 23, 2016 at 10:13:18AM -0800, Johannes Weiner wrote:
> On Tue, Feb 23, 2016 at 03:04:26PM +0000, Mel Gorman wrote:
> > VM statistic counters for reclaim decisions are zone-based. If the kernel
> > is to reclaim on a per-node basis then we need to track per-node statistics
> > but there is no infrastructure for that. The most notable change is that
> > the old node_page_state is renamed to sum_zone_node_page_state. The new
> > node_page_state takes a pglist_data and uses per-node stats but none exist
> > yet. There is some renaming such as vm_stat to vm_zone_stat and the addition
> > of vm_node_stat and the renaming of mod_state to mod_zone_state. Otherwise,
> > this is mostly a mechanical patch with no functional change. There is a
> > lot of similarity between the node and zone helpers which is unfortunate
> > but there was no obvious way of reusing the code and maintaining type safety.
> >
> > Signed-off-by: Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx>
>
> Hopefully we can eventually ditch /proc/zoneinfo in favor of a
> /proc/nodeinfo and get rid of the per-zone stats accounting.
>

It may not be possible to ditch /proc/zoneinfo entirely but a /proc/nodeinfo
would make sense. It may interfere with userspace that's aware of kernel
internals but that may be manageable.

> In general, this patch looks good to me.
>
> Acked-by: Johannes Weiner <hannes@xxxxxxxxxxx>
>

Thanks.

> Only one thing I noticed:
>
> > @@ -349,12 +349,14 @@ static unsigned long count_shadow_nodes(struct shrinker *shrinker,
> >          shadow_nodes = list_lru_shrink_count(&workingset_shadow_nodes, sc);
> >          local_irq_enable();
> >
> > -        if (memcg_kmem_enabled())
> > +        if (memcg_kmem_enabled()) {
> >                  pages = mem_cgroup_node_nr_lru_pages(sc->memcg, sc->nid,
> >                                                       LRU_ALL_FILE);
> > -        else
> > -                pages = node_page_state(sc->nid, NR_ACTIVE_FILE) +
> > -                        node_page_state(sc->nid, NR_INACTIVE_FILE);
> > +        } else {
> > +                pg_data_t *pgdat = NODE_DATA(sc->nid);
> > +                pages = node_page_state(pgdat, NR_ACTIVE_FILE) +
> > +                        node_page_state(pgdat, NR_INACTIVE_FILE);
> > +        }
>
> That should also be sum_zone_node_page_state, right? These are not
> > valid node items (yet).

Yep, not for another two patches. Fixed now.
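
For anyone following the series, here is a toy userspace model of why the
quoted hunk reads the wrong counters at this point. It is only a sketch:
the struct layouts, the single stat_item enum, the node_data array and the
two file_pages_* helpers are simplifications made up for illustration, not
the kernel's definitions. Only the names sum_zone_node_page_state,
node_page_state, NODE_DATA and the NR_*_FILE items come from the patch.

```c
#include <assert.h>

/* Simplified stand-in: one enum for both zone and node items. */
enum stat_item { NR_INACTIVE_FILE, NR_ACTIVE_FILE, NR_STAT_ITEMS };

#define MAX_NR_ZONES 3
#define MAX_NUMNODES 1

struct zone {
	unsigned long vm_stat[NR_STAT_ITEMS];	/* per-zone counters */
};

typedef struct pglist_data {
	struct zone node_zones[MAX_NR_ZONES];
	/* Per-node counters: present, but nothing accounts to them yet. */
	unsigned long vm_stat[NR_STAT_ITEMS];
} pg_data_t;

/* Hypothetical backing store for the model's NODE_DATA(). */
pg_data_t node_data[MAX_NUMNODES];
#define NODE_DATA(nid) (&node_data[(nid)])

/* Old node_page_state() under its new name: sum an item over the zones. */
unsigned long sum_zone_node_page_state(int nid, enum stat_item item)
{
	unsigned long sum = 0;
	int i;

	assert(nid >= 0 && nid < MAX_NUMNODES);
	for (i = 0; i < MAX_NR_ZONES; i++)
		sum += NODE_DATA(nid)->node_zones[i].vm_stat[item];
	return sum;
}

/* New node_page_state(): reads the per-node array only. */
unsigned long node_page_state(pg_data_t *pgdat, enum stat_item item)
{
	return pgdat->vm_stat[item];
}

/* What the else branch needs while the LRU items are still per-zone. */
unsigned long file_pages_zone_based(int nid)
{
	return sum_zone_node_page_state(nid, NR_ACTIVE_FILE) +
	       sum_zone_node_page_state(nid, NR_INACTIVE_FILE);
}

/* What the quoted hunk does: valid only once the items are per-node. */
unsigned long file_pages_node_based(int nid)
{
	pg_data_t *pgdat = NODE_DATA(nid);

	return node_page_state(pgdat, NR_ACTIVE_FILE) +
	       node_page_state(pgdat, NR_INACTIVE_FILE);
}
```

In this model, if the zones of node 0 account 10 active and 30 inactive file
pages, file_pages_zone_based(0) returns 40 while file_pages_node_based(0)
returns 0, because nothing updates the per-node array yet.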
--
Mel Gorman
SUSE Labs