Re: [PATCH] mm: page_alloc: dump migrate-failed pages only at -EBUSY

From: David Hildenbrand
Date: Fri May 21 2021 - 04:10:52 EST


On 20.05.21 22:51, Minchan Kim wrote:
On Thu, May 20, 2021 at 09:28:09PM +0200, David Hildenbrand wrote:
Minchan Kim <minchan@xxxxxxxxxx> wrote on Thu, 20 May 2021 at 21:20:

On Wed, May 19, 2021 at 02:33:41PM -0700, Minchan Kim wrote:
alloc_contig_dump_pages aims to help debug page migration failures
caused by a page refcount mismatch or some other problem with the
page itself in the migration handler. However, in the -ENOMEM case
there is no clue to be found in the page descriptor information, so
dump the pages only when -EBUSY happens.

Signed-off-by: Minchan Kim <minchan@xxxxxxxxxx>
---
mm/page_alloc.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 3100fcb08500..c0a2971dc755 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -8760,7 +8760,8 @@ static int __alloc_contig_migrate_range(struct
compact_control *cc,

lru_cache_enable();
if (ret < 0) {
- alloc_contig_dump_pages(&cc->migratepages);
+ if (ret == -EBUSY)
+ alloc_contig_dump_pages(&cc->migratepages);
putback_movable_pages(&cc->migratepages);
return ret;
}
--
2.31.1.751.gd2f1c929bd-goog


Resending with a slightly modified description.

From c5a2fea291cf46079b87cc9ac9a25fc7f819d0fd Mon Sep 17 00:00:00 2001
From: Minchan Kim <minchan@xxxxxxxxxx>
Date: Wed, 19 May 2021 14:22:18 -0700
Subject: [PATCH] mm: page_alloc: dump migrate-failed pages only at -EBUSY

alloc_contig_dump_pages aims to help debug page migration failures
caused by a page refcount elevated above expected_count
(for the details, please look at migrate_page_move_mapping).

However, -ENOMEM only means the system is under memory pressure,
which is not related to the page refcount at all. Thus, dumping
the page list is not helpful from a debugging point of view.


What about -ENOMEM when migrating empty/free huge pages? I think there is
value in having the pages dumped to identify something like that. And it
doesn't require heavy memory pressure to fail allocating a huge page.


-ENOMEM means there is no memory to allocate a destination page.
How would dumping the source pages help in that case, from the
dump_page content point of view?

You would spot a huge page in the source list (usually in the first position) without any obvious migration blockers, I assume?

I'm wondering, did you actually run into this being suboptimal? If dumping too much when running into -ENOMEM is a real problem, fine with me. If it's a theoretical issue, I'd prefer to keep it simple as is.

--
Thanks,

David / dhildenb