Re: mm, vmscan: commit makes PAE kernel crash nightly (bisected)

From: Mel Gorman
Date: Wed Jan 18 2017 - 12:48:48 EST


On Tue, Jan 17, 2017 at 03:54:51PM +0100, Michal Hocko wrote:
> On Tue 17-01-17 14:21:14, Mel Gorman wrote:
> > On Tue, Jan 17, 2017 at 02:52:28PM +0100, Michal Hocko wrote:
> > > On Mon 16-01-17 11:09:34, Mel Gorman wrote:
> > > [...]
> > > > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > > > index 532a2a750952..46aac487b89a 100644
> > > > --- a/mm/vmscan.c
> > > > +++ b/mm/vmscan.c
> > > > @@ -2684,6 +2684,7 @@ static void shrink_zones(struct zonelist *zonelist, struct scan_control *sc)
> > > > continue;
> > > >
> > > > if (sc->priority != DEF_PRIORITY &&
> > > > + !buffer_heads_over_limit &&
> > > > !pgdat_reclaimable(zone->zone_pgdat))
> > > > continue; /* Let kswapd poll it */
> > >
> > > I think we should rather remove pgdat_reclaimable here. This sounds like
> > > a wrong layer to decide whether we want to reclaim and how much.
> > >
> >
> > I had considered that but it'd also be important to add the other 32-bit
> > patches you have posted to see the impact. Because of the ratio of LRU pages
> > to slab pages, it may not have an impact but it'd need to be eliminated.
>
> OK, Trevor you can pull from
> git://git.kernel.org/pub/scm/linux/kernel/git/mhocko/mm.git tree
> fixes/highmem-node-fixes branch. This contains the current mmotm tree +
> the latest highmem fixes. I also do not expect this would help much in
> your case but as Mel've said we should rule that out at least.
>

After considering slab shrinking of lower nodes, it occurs to me that your
fixes also impact slab shrinking. For lowmem-constrained allocations,
we accounted for scans on the lower zones but shrunk slabs proportional to
the total LRU size. If the lower zones had few LRU pages and were mostly
slab pages then the proportional calculation would be way off. This may
have a bigger impact on Trevor Cordes' situation than I had imagined at
the start of today.
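
To illustrate the skew, here is a minimal userspace sketch of the
proportional calculation. It is not the kernel code: slab_scan_target()
and its parameters are hypothetical names, and the formula is a simplified
assumption (slab objects to scan scale with LRU pages scanned divided by
the LRU pages taken as the base).

```c
#include <stdint.h>

/*
 * Hypothetical helper, not a kernel function: scale the number of
 * freeable slab objects by the fraction of the LRU that was scanned.
 *
 *   freeable   - slab objects the shrinker could free
 *   nr_scanned - LRU pages scanned in the zones eligible for the allocation
 *   lru_base   - LRU pages used as the denominator of the ratio
 *
 * The +1 only guards against division by zero in this sketch.
 */
static uint64_t slab_scan_target(uint64_t freeable, uint64_t nr_scanned,
                                 uint64_t lru_base)
{
        return (freeable * nr_scanned) / (lru_base + 1);
}
```

With made-up numbers, say 100000 freeable objects and 1000 pages scanned
in the lower zones: using a node-wide LRU of 2000000 pages as the base
gives a target of 49 objects, while using only the 2000 LRU pages in the
lower zones gives 49975. The wrong base under-shrinks slab by roughly
three orders of magnitude in this example.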

--
Mel Gorman
SUSE Labs