Re: [RFC 13/13] mm, compaction: fix and improve watermark handling

From: Michal Hocko
Date: Wed May 18 2016 - 10:28:00 EST


On Wed 18-05-16 14:50:04, Mel Gorman wrote:
> On Mon, May 16, 2016 at 11:25:05AM +0200, Michal Hocko wrote:
> > On Tue 10-05-16 09:36:03, Vlastimil Babka wrote:
> > > Compaction has been using watermark checks when deciding whether it was
> > > successful, and whether compaction is at all suitable. There are few problems
> > > with these checks.
> > >
> > > - __compact_finished() uses low watermark in a check that has to pass if
> > > the direct compaction is to finish and allocation should succeed. This is
> > > too pessimistic, as the allocation will typically use min watermark. It
> > > may happen that during compaction, we drop below the low watermark (due to
> > > parallel activity), but still form the target high-order page. By checking
> > > against low watermark, we might needlessly continue compaction. After this
> > > patch, the check uses direct compactor's alloc_flags to determine the
> > > watermark, which is effectively the min watermark.
> >
> > OK, this makes some sense. It would be great if we could have at least
> > some clarification why the low wmark has been used previously. Probably
> > Mel can remember?
> >
>
> Two reasons -- it was a very rough estimate of whether enough pages are free
> for compaction to have any chance. Secondly, it was to minimise the risk
> that compaction would isolate so many pages that the zone was completely
> depleted. This was a concern during the initial prototype of compaction.
>
> > > - __compaction_suitable() then checks the low watermark plus a (2 << order) gap
> > > to decide if there's enough free memory to perform compaction. This check
> >
> > And this was a real head scratcher when I started looking into the
> > compaction recently. Why do we need to be above low watermark to even
> > start compaction. Compaction uses additional memory only for a short
> > period of time and then releases the already migrated pages.
> >
>
> Simply minimising the risk that compaction would deplete the entire
> zone. Sure, it hands pages back shortly afterwards. At the time of the
> initial prototype, page migration was severely broken and the system was
> constantly crashing. The cautious checks were left in place after page
> migration was fixed as there wasn't a compelling reason to remove them
> at the time.

OK, then moving to min_wmark + bias from low_wmark should work, right?
This would at least remove the discrepancy between the reclaim and
compaction thresholds to some degree. Which is good IMHO.

Thanks!
--
Michal Hocko
SUSE Labs