Re: Kernel 5.1.15 stuck in compaction

From: Max Kellermann
Date: Mon Jul 08 2019 - 08:23:37 EST


On 2019/07/08 12:35, Max Kellermann <max@xxxxxxxx> wrote:
> one of our web servers got repeatedly stuck in the memory compaction
> code; two PHP processes have been busy at 100% inside memory
> compaction after a page fault:

This trace may be helpful as well; the first PHP process:

275.846 compaction:mm_compaction_isolate_migratepages:range=(0x8a48e0 ~ 0x8a48e0) nr_scanned=0 nr_taken=0
LOST 8 events!
275.894 compaction:mm_compaction_isolate_migratepages:range=(0x8a48e0 ~ 0x8a48e0) nr_scanned=0 nr_taken=0
LOST 8 events!
275.942 compaction:mm_compaction_isolate_migratepages:range=(0x8a48e0 ~ 0x8a48e0) nr_scanned=0 nr_taken=0
LOST 8 events!
275.989 compaction:mm_compaction_isolate_migratepages:range=(0x8a48e0 ~ 0x8a48e0) nr_scanned=0 nr_taken=0

This is the other PHP process:

188.501 compaction:mm_compaction_isolate_migratepages:range=(0x169f40 ~ 0x169f40) nr_scanned=0 nr_taken=0
LOST 16 events!
188.600 compaction:mm_compaction_isolate_migratepages:range=(0x169f40 ~ 0x169f40) nr_scanned=0 nr_taken=0
LOST 5 events!
188.643 compaction:mm_compaction_isolate_migratepages:range=(0x169f40 ~ 0x169f40) nr_scanned=0 nr_taken=0
LOST 17 events!
188.742 compaction:mm_compaction_isolate_migratepages:range=(0x169f40 ~ 0x169f40) nr_scanned=0 nr_taken=0

No pages are being scanned at all; the start and end of each range are
the same.
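
For anyone who wants to check a longer capture for the same pattern,
here is a rough sketch of a script that counts empty-range events. It
only assumes the line format shown in the excerpts above; the names
summarize() and SAMPLE are made up for illustration and are not part of
perf or the kernel.

```python
import re

# Matches the mm_compaction_isolate_migratepages lines as printed above.
TRACE_RE = re.compile(
    r"^\s*(?P<ts>\d+\.\d+)\s+"
    r"compaction:mm_compaction_isolate_migratepages:"
    r"range=\((?P<start>0x[0-9a-fA-F]+) ~ (?P<end>0x[0-9a-fA-F]+)\)\s+"
    r"nr_scanned=(?P<scanned>\d+)\s+nr_taken=(?P<taken>\d+)"
)
# Matches the "LOST n events!" markers emitted between events.
LOST_RE = re.compile(r"^LOST (\d+) events!")


def summarize(lines):
    """Count isolate_migratepages events, empty ranges, and lost events."""
    events = empty = lost = 0
    for line in lines:
        m = TRACE_RE.match(line)
        if m:
            events += 1
            start = int(m["start"], 16)
            end = int(m["end"], 16)
            # "Empty" here means start == end and nothing was scanned.
            if start == end and int(m["scanned"]) == 0:
                empty += 1
            continue
        m = LOST_RE.match(line)
        if m:
            lost += int(m.group(1))
    return {"events": events, "empty_ranges": empty, "lost": lost}


# Hypothetical sample input: the first PHP process's lines from above.
SAMPLE = """\
275.846 compaction:mm_compaction_isolate_migratepages:range=(0x8a48e0 ~ 0x8a48e0) nr_scanned=0 nr_taken=0
LOST 8 events!
275.894 compaction:mm_compaction_isolate_migratepages:range=(0x8a48e0 ~ 0x8a48e0) nr_scanned=0 nr_taken=0
LOST 8 events!
275.942 compaction:mm_compaction_isolate_migratepages:range=(0x8a48e0 ~ 0x8a48e0) nr_scanned=0 nr_taken=0
LOST 8 events!
275.989 compaction:mm_compaction_isolate_migratepages:range=(0x8a48e0 ~ 0x8a48e0) nr_scanned=0 nr_taken=0
""".splitlines()

if __name__ == "__main__":
    print(summarize(SAMPLE))
```

If every event in a capture comes back as an empty range, that matches
what is seen here. Note that the lost events mean the counts are only a
lower bound.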

However, my perf report contains calls to
compact_unlock_should_abort(), so the loop in
isolate_migratepages_block() is not being skipped completely;
it is just exiting too early.