Re: [RFC 04/13] mm, page_alloc: restructure direct compaction handling in slowpath

From: Vlastimil Babka
Date: Fri May 13 2016 - 04:11:44 EST


On 05/12/2016 03:29 PM, Michal Hocko wrote:
On Tue 10-05-16 09:35:54, Vlastimil Babka wrote:
This patch attempts to restructure the code with only minimal functional
changes. The call to the first compaction and THP-specific checks are now
placed above the retry loop, and the "noretry" direct compaction is removed.

The initial compaction is additionally restricted only to costly orders, as we
can expect smaller orders to be held back by watermarks, and only larger orders
to suffer primarily from fragmentation. This better matches the checks in
reclaim's shrink_zones().

There are two other smaller functional changes. One is that the upgrade from
async migration to light sync migration will always occur after the initial
compaction.

I do not think this belongs to the patch. There are two reasons. First
we do not need to do potentially more expensive sync mode when async is
able to make some progress and the second

My concern was that __GFP_NORETRY non-costly allocations wouldn't otherwise get a MIGRATE_SYNC_LIGHT pass at all. Previously they would get it in the noretry: label. So do you think it's a corner case not worth caring about? Alternatively we could also remove the 'restriction of initial async compaction to costly orders' from this patch and apply it separately later. That would also result in the flip to sync_light after the initial async attempt for these allocations.

is that with the currently
fragile compaction implementation this might reintroduce the premature
OOM for order-2 requests reported by Hugh. Please see
http://lkml.kernel.org/r/alpine.LSU.2.11.1604141114290.1086@xxxxxxxxxxxx

Hmm IIRC that involved some wrong conflict resolution in mmotm? I don't remember what the code exactly did look like, but wasn't the problem that the initial compaction was async, then the left-over hunk changed migration_mode to sync_light, and then should_compact_retry() thought "oh we already failed sync_light, return false" when in fact the sync_light compaction never happened? Otherwise I don't see how switching to sync_light "too early" could lead to premature OOMs.

Your later patch (which I haven't reviewed yet) is then changing this
considerably

Yes, my other concern with should_compact_retry() after your "mm, oom: protect !costly allocations some more" is that relying on compaction_failed() to upgrade the migration mode is unreliable. Async compaction can easily keep returning as contended, so might never see the COMPACT_COMPLETE result, if it's e.g. limited to nodes without a really small zone such as ZONE_DMA.

but I think it would be safer to not touch the migration
mode in this - mostly cleanup - patch.

This is how it has been until recent patch "mm, oom: protect
!costly allocations some more", which introduced upgrading the mode based on
COMPACT_COMPLETE result, but kept the final compaction always upgraded, which
made it even more special. It's better to return to the simpler handling for
now, as migration modes will be further modified later in the series.

The second change is that once both reclaim and compaction declare it's not
worth to retry the reclaim/compact loop, there is no final compaction attempt.
As argued above, this is intentional. If that final compaction were to succeed,
it would be due to a wrong retry decision, or simply a race with somebody else
freeing memory for us.

The main outcome of this patch should be simpler code. Logically, the initial
compaction without reclaim is the exceptional case to the reclaim/compaction
scheme, but prior to the patch, it was the last loop iteration that was
exceptional. Now the code matches the logic better. The change also enable the
following patches.

Signed-off-by: Vlastimil Babka <vbabka@xxxxxxx>

Other than the above thing I like this patch.
Acked-by: Michal Hocko <mhocko@xxxxxxxx>

Thanks!