Re: [PATCH V4 07/15] mm, compaction: khugepaged should not give up due to need_resched()

From: Mel Gorman
Date: Fri Jul 25 2014 - 08:31:19 EST


On Wed, Jul 16, 2014 at 03:48:15PM +0200, Vlastimil Babka wrote:
> Async compaction aborts when it detects zone lock contention or need_resched()
> is true. David Rientjes has reported that in practice, most direct async
> compactions for THP allocation abort due to need_resched(). This means that a
> second direct compaction is never attempted, which might be OK for a page
> fault, but khugepaged is intended to attempt a sync compaction in such case and
> in these cases it won't.
>
> This patch replaces "bool contended" in compact_control with an int that
> distinguieshes between aborting due to need_resched() and aborting due to lock
> contention. This allows propagating the abort through all compaction functions
> as before, but passing the abort reason up to __alloc_pages_slowpath() which
> decides when to continue with direct reclaim and another compaction attempt.
>
> Another problem is that try_to_compact_pages() did not act upon the reported
> contention (both need_resched() or lock contention) immediately and would
> proceed with another zone from the zonelist. When need_resched() is true, that
> means initializing another zone compaction, only to check again need_resched()
> in isolate_migratepages() and aborting. For zone lock contention, the
> unintended consequence is that the lock contended status reported back to the
> allocator is detrmined from the last zone where compaction was attempted, which
> is rather arbitrary.
>
> This patch fixes the problem in the following way:
> - async compaction of a zone aborting due to need_resched() or fatal signal
> pending means that further zones should not be tried. We report
> COMPACT_CONTENDED_SCHED to the allocator.
> - aborting zone compaction due to lock contention means we can still try
> another zone, since it has different set of locks. We report back
> COMPACT_CONTENDED_LOCK only if *all* zones where compaction was attempted,
> it was aborted due to lock contention.
>
> As a result of these fixes, khugepaged will proceed with second sync compaction
> as intended, when the preceding async compaction aborted due to need_resched().
> Page fault compactions aborting due to need_resched() will spare some cycles
> previously wasted by initializing another zone compaction only to abort again.
> Lock contention will be reported only when compaction in all zones aborted due
> to lock contention, and therefore it's not a good idea to try again after
> reclaim.
>
> Reported-by: David Rientjes <rientjes@xxxxxxxxxx>
> Signed-off-by: Vlastimil Babka <vbabka@xxxxxxx>
> Cc: Minchan Kim <minchan@xxxxxxxxxx>
> Cc: Mel Gorman <mgorman@xxxxxxx>
> Cc: Joonsoo Kim <iamjoonsoo.kim@xxxxxxx>
> Cc: Michal Nazarewicz <mina86@xxxxxxxxxx>
> Cc: Naoya Horiguchi <n-horiguchi@xxxxxxxxxxxxx>
> Cc: Christoph Lameter <cl@xxxxxxxxx>
> Cc: Rik van Riel <riel@xxxxxxxxxx>

Acked-by: Mel Gorman <mgorman@xxxxxxx>

--
Mel Gorman
SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/