Re: [PATCH 04/27] mm, vmscan: Begin reclaiming pages on a per-node basis

From: Mel Gorman
Date: Thu Jun 23 2016 - 06:59:05 EST


On Wed, Jun 22, 2016 at 04:04:34PM +0200, Vlastimil Babka wrote:
> >-static void shrink_zones(struct zonelist *zonelist, struct scan_control *sc)
> >+static void shrink_zones(struct zonelist *zonelist, struct scan_control *sc,
> >+ enum zone_type classzone_idx)
> > {
> > struct zoneref *z;
> > struct zone *zone;
> > unsigned long nr_soft_reclaimed;
> > unsigned long nr_soft_scanned;
> > gfp_t orig_mask;
> >- enum zone_type requested_highidx = gfp_zone(sc->gfp_mask);
> >
> > /*
> > * If the number of buffer_heads in the machine exceeds the maximum
> >@@ -2560,15 +2579,20 @@ static void shrink_zones(struct zonelist *zonelist, struct scan_control *sc)
> >
> > for_each_zone_zonelist_nodemask(zone, z, zonelist,
> > gfp_zone(sc->gfp_mask), sc->nodemask) {
>
> Using sc->reclaim_idx could be faster/nicer here than gfp_zone()?

Yes, then reclaim_idx and classzone_idx need to be updated if
buffer_heads_over_limit is set in the check above, but that is better anyway.
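
Roughly, the check would become something like this (untested sketch,
so the exact placement may differ):

	/*
	 * If the number of buffer_heads exceeds the allowed level,
	 * force reclaim to also scan highmem, as highmem pages can
	 * pin lowmem pages storing buffer_heads. Widening the mask
	 * means both indices must be raised to match it.
	 */
	orig_mask = sc->gfp_mask;
	if (buffer_heads_over_limit) {
		sc->gfp_mask |= __GFP_HIGHMEM;
		sc->reclaim_idx = classzone_idx = gfp_zone(sc->gfp_mask);
	}

	for_each_zone_zonelist_nodemask(zone, z, zonelist,
					sc->reclaim_idx, sc->nodemask) {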

> Although after "mm, vmscan: Update classzone_idx if buffer_heads_over_limit"
> there would need to be a variable for the highmem adjusted value - maybe
> reuse "requested_highidx"? Not important though.
>

I think it's ok in the buffer_heads_over_limit case to reclaim from
more zones than requested. It may require another pass through
do_try_to_free_pages if a low zone required by the caller was not
reclaimed, but that's ok and expected if there are too many
buffer_heads.
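
To illustrate, the caller-side behaviour is roughly this (a heavily
simplified sketch of the allocator slowpath, not the exact code):

	/*
	 * If direct reclaim made progress but the low zone the caller
	 * needs is still below its watermark, the slowpath loops and
	 * enters reclaim again rather than failing the allocation.
	 */
	did_some_progress = try_to_free_pages(zonelist, order,
					      gfp_mask, nodemask);
	if (did_some_progress)
		goto retry;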

> >- enum zone_type classzone_idx;
> >-
> > if (!populated_zone(zone))
> > continue;
> >
> >- classzone_idx = requested_highidx;
> >+ /*
> >+ * Note that reclaim_idx does not change as it is the highest
> >+ * zone reclaimed from which for empty zones is a no-op but
> >+ * classzone_idx is used by shrink_node to test if the slabs
> >+ * should be shrunk on a given node.
> >+ */
> > while (!populated_zone(zone->zone_pgdat->node_zones +
> >- classzone_idx))
> >+ classzone_idx)) {
> > classzone_idx--;
> >+ continue;
> >+ }
> >
> > /*
> > * Take care memory controller reclaiming has small influence
> >@@ -2594,8 +2618,8 @@ static void shrink_zones(struct zonelist *zonelist, struct scan_control *sc)
> > */
> > if (IS_ENABLED(CONFIG_COMPACTION) &&
> > sc->order > PAGE_ALLOC_COSTLY_ORDER &&
> >- zonelist_zone_idx(z) <= requested_highidx &&
> >- compaction_ready(zone, sc->order, requested_highidx)) {
> >+ zonelist_zone_idx(z) <= classzone_idx &&
> >+ compaction_ready(zone, sc->order, classzone_idx)) {
> > sc->compaction_ready = true;
> > continue;
> > }
> >@@ -2615,7 +2639,7 @@ static void shrink_zones(struct zonelist *zonelist, struct scan_control *sc)
> > /* need some check for avoid more shrink_zone() */
> > }
> >
> >- shrink_zone(zone, sc, zone_idx(zone) == classzone_idx);
> >+ shrink_node(zone->zone_pgdat, sc, classzone_idx);
> > }
> >
> > /*
> >@@ -2647,6 +2671,7 @@ static unsigned long do_try_to_free_pages(struct zonelist *zonelist,
> > int initial_priority = sc->priority;
> > unsigned long total_scanned = 0;
> > unsigned long writeback_threshold;
> >+ enum zone_type classzone_idx = sc->reclaim_idx;
>
> Hmm, try_to_free_mem_cgroup_pages() seems to call this with sc->reclaim_idx
> not explicitly initialized (e.g. 0). And shrink_all_memory() as well. I
> probably didn't check them in v6 and pointed out only try_to_free_pages()
> (which is now OK), sorry.
>

That gets fixed in "mm, memcg: move memcg limit enforcement from zones
to nodes" but I can move the hunk to this patch to make bisection a
little easier.
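
Concretely, the hunk just initialises the new field where the
scan_control is built, along these lines (a sketch; the surrounding
fields are from the current try_to_free_mem_cgroup_pages and may
move around):

	unsigned long try_to_free_mem_cgroup_pages(struct mem_cgroup *memcg,
						   unsigned long nr_pages,
						   gfp_t gfp_mask,
						   bool may_swap)
	{
		struct scan_control sc = {
			.nr_to_reclaim = max(nr_pages, SWAP_CLUSTER_MAX),
			.gfp_mask = (gfp_mask & GFP_RECLAIM_MASK) |
				(GFP_HIGHUSER_MOVABLE & ~GFP_RECLAIM_MASK),
			/* memcg reclaim is not limited to a classzone */
			.reclaim_idx = MAX_NR_ZONES - 1,
			.target_mem_cgroup = memcg,
			.priority = DEF_PRIORITY,
			.may_writepage = !laptop_mode,
			.may_unmap = 1,
			.may_swap = may_swap,
		};

and similarly for shrink_all_memory().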

> > retry:
> > delayacct_freepages_start();
> >
> >@@ -2657,7 +2682,7 @@ static unsigned long do_try_to_free_pages(struct zonelist *zonelist,
> > vmpressure_prio(sc->gfp_mask, sc->target_mem_cgroup,
> > sc->priority);
> > sc->nr_scanned = 0;
> >- shrink_zones(zonelist, sc);
> >+ shrink_zones(zonelist, sc, classzone_idx);
>
> Looks like classzone_idx is only used here to pass to shrink_zones()
> unchanged, which means shrink_zones() can just read sc->reclaim_idx
> directly without a new param?
>

Yes
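
Roughly (a sketch, assuming the parameter is simply dropped and
shrink_zones() reads the index from the scan_control):

	static void shrink_zones(struct zonelist *zonelist,
				 struct scan_control *sc)
	{
		...
		for_each_zone_zonelist_nodemask(zone, z, zonelist,
					sc->reclaim_idx, sc->nodemask) {

with the call site in do_try_to_free_pages() reverting to

	shrink_zones(zonelist, sc);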

--
Mel Gorman
SUSE Labs