Re: [patch] mm: vmscan: invoke slab shrinkers from shrink_zone()

From: Dave Chinner
Date: Thu Apr 16 2015 - 19:18:19 EST


On Thu, Apr 16, 2015 at 10:34:13AM -0400, Johannes Weiner wrote:
> On Thu, Apr 16, 2015 at 12:57:36PM +0900, Joonsoo Kim wrote:
> > This causes the following success rate regression in phases 1 and 2 of the
> > stress-highalloc benchmark. In phases 1 and 2, many high-order allocations
> > are requested while many threads run kernel builds in parallel.
>
> Yes, the patch made the shrinkers on multi-zone nodes less aggressive.
> From the changelog:
>
>     This changes kswapd behavior, which used to invoke the shrinkers for each
>     zone, but with scan ratios gathered from the entire node, resulting in
>     meaningless pressure quantities on multi-zone nodes.
>
> So the previous code *did* apply more pressure on the shrinkers, but
> it didn't make any sense. The number of slab objects to scan for each
> scanned LRU page depended on how many zones there were in a node, and
> their relative sizes. So a node with a large DMA32 and a small Normal
> would receive vastly different relative slab pressure than a node with
> only one big zone Normal. That's not something we should revert to.
>
> If we are too weak on objects compared to LRU pages then we should
> adjust DEFAULT_SEEKS or individual shrinker settings.

Now this thread has my attention. Changing shrinker defaults will
seriously upset the memory balance under load (in unpredictable
ways) so I really don't think we should even consider changing
DEFAULT_SEEKS.
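
For context on why DEFAULT_SEEKS is such a blunt knob, here is a
minimal Python sketch of the per-shrinker scan target as mm/vmscan.c
computed it around this time. It is a simplified model, not the kernel
code: the function name scan_delta and the example numbers are made up
for illustration, and it ignores the batching, deferred work and
clamping that the real code does.

```python
# Simplified model of the slab scan target; not the actual kernel code.
DEFAULT_SEEKS = 2  # value of the kernel constant in include/linux/shrinker.h


def scan_delta(nr_scanned, lru_pages, freeable, seeks=DEFAULT_SEEKS):
    """Objects one shrinker is asked to scan for a given amount of LRU scanning.

    nr_scanned: LRU pages scanned in this reclaim pass
    lru_pages:  LRU pages eligible for reclaim
    freeable:   objects the shrinker reports as freeable
    seeks:      the shrinker's cost setting (lower means more pressure)
    """
    delta = (4 * nr_scanned) // seeks
    delta *= freeable
    return delta // (lru_pages + 1)


if __name__ == "__main__":
    # Hypothetical load: 1000 of 99999 LRU pages scanned, 50000 freeable objects.
    print(scan_delta(1000, 99999, 50000))            # 1000
    print(scan_delta(1000, 99999, 50000, seeks=1))   # 2000
```

With these made-up numbers, halving seeks from 2 to 1 doubles the scan
target from 1000 to 2000 objects. Changing the default would do that for
every shrinker that uses it at the same time, which is why adjusting one
shrinker and observing the result is the safer experiment.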

If there's a shrinker imbalance, we need to understand which
shrinker needs rebalancing, then modify that shrinker's
configuration and then observe the impact this has on the rest of
the system. This means looking at variance of the memory footprint
in steady state, reclaim overshoot and damping rates before steady
state is achieved, etc. Balancing multiple shrinkers (especially
those with dependencies on other caches) under memory
load is a non-trivial undertaking.

I don't see any evidence that we have a shrinker imbalance, so I
really suspect the problem is "shrinkers aren't doing enough work".
In that case, we need to increase the pressure being generated, not
start fiddling around with shrinker configurations.

> If we think our pressure ratio is accurate but we don't reclaim enough
> compared to our compaction efforts, then any adjustments to improve
> huge page success rate should come from the allocator/compaction side.

Right - if compaction is failing, then the problem is more likely
that it isn't generating enough pressure, and so the shrinkers
aren't doing the work we are expecting them to do. That's a problem
with compaction, not the shrinkers...

Cheers,

Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx