Re: [PATCH 00/35] AutoNUMA alpha14

From: Andrea Arcangeli
Date: Tue May 29 2012 - 13:16:15 EST


Hi Kirill,

The anon page was munmapped just after get_page_unless_zero obtained a
refcount in knuma_migrated. This can happen, for example, if a big
process exits while knuma_migrated starts to migrate the page. In that
case split_huge_page does nothing, and when it does nothing it
notifies the caller by returning 1. When it returns 1, we just need to
put_page and bail out (the page isn't split in that case and it's
pointless to try to migrate a freed page).
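
To make the ordering of the race concrete, here is a minimal userspace
sketch. It is not the kernel code: struct fake_page and every fake_*
helper are hypothetical stand-ins that only model the behaviour
described above (the pin taken by get_page_unless_zero, split_huge_page
returning 1 when it does nothing, and the put_page on bail out).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct page, for illustration only. */
struct fake_page {
	int refcount;	/* 0 means the page has been freed */
	bool mapped;	/* cleared when the owner munmaps the page */
	bool huge;	/* still a transparent hugepage */
};

/* Models get_page_unless_zero(): pin the page only if it is still alive. */
static bool fake_get_page_unless_zero(struct fake_page *p)
{
	if (p->refcount == 0)
		return false;
	p->refcount++;
	return true;
}

/* Models put_page(): drop one reference. */
static void fake_put_page(struct fake_page *p)
{
	p->refcount--;
}

/* Models the owner unmapping the page and dropping its own reference. */
static void fake_munmap(struct fake_page *p)
{
	p->mapped = false;
	fake_put_page(p);
}

/*
 * Models split_huge_page() as described in this mail: it returns 1 and
 * does nothing if the page is no longer mapped, 0 after a real split.
 */
static int fake_split_huge_page(struct fake_page *p)
{
	if (!p->mapped)
		return 1;
	p->huge = false;
	return 0;
}

/*
 * Called with the page already pinned. Returns true if the page can go
 * on to migration, false if the pin was dropped and the page skipped.
 */
static bool fake_isolate_pinned(struct fake_page *p)
{
	if (p->huge && fake_split_huge_page(p)) {
		fake_put_page(p);	/* drop our pin, which frees the page */
		return false;
	}
	return true;
}

/* Replays the race in order and returns the final refcount. */
static int fake_race_demo(void)
{
	struct fake_page page = { .refcount = 1, .mapped = true, .huge = true };
	bool pinned = fake_get_page_unless_zero(&page);	/* refcount 1 -> 2 */

	assert(pinned);
	fake_munmap(&page);				/* refcount 2 -> 1 */

	bool migrate = fake_isolate_pinned(&page);	/* refcount 1 -> 0 */

	assert(!migrate);
	return page.refcount;
}
```

In this model the munmap lands between the pin and the split, so the
migrator's pin is the last reference and dropping it is what frees the
page. Without the put_page on that path the pin would never be released.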

I also made the code stricter, to be sure the cause of the bug
wasn't a hugepage in the LRU that wasn't Anon. Such a thing must not
exist, but this will verify it just in case.

I'll push it to the origin/autonuma branch of aa.git shortly
(rebased). Could you check whether it helps?

diff --git a/mm/autonuma.c b/mm/autonuma.c
index 3d4c2a7..c2a5a82 100644
--- a/mm/autonuma.c
+++ b/mm/autonuma.c
@@ -840,9 +840,17 @@ static int isolate_migratepages(struct list_head *migratepages,

 		VM_BUG_ON(nid != page_to_nid(page));

-		if (PageAnon(page) && PageTransHuge(page))
+		if (PageTransHuge(page)) {
+			VM_BUG_ON(!PageAnon(page));
 			/* FIXME: remove split_huge_page */
-			split_huge_page(page);
+			if (unlikely(split_huge_page(page))) {
+				autonuma_printk("autonuma migrate THP free\n");
+				__autonuma_migrate_page_remove(page,
+							       page_autonuma);
+				put_page(page);
+				continue;
+			}
+		}

 		__autonuma_migrate_page_remove(page, page_autonuma);


Thanks a lot,
Andrea

BTW, it's interesting that knuma_migrated0 runs on CPU24. Just in case,
you may also want to verify that this is correct with numactl --hardware;
in my case the highest cpuid in node0 is 17. It's not related to the
fix above, which is needed anyway.