Re: [PATCH] mm: compaction: Abort compaction if too many pages are isolated and caller is asynchronous

From: Minchan Kim
Date: Tue May 31 2011 - 09:33:57 EST


On Tue, May 31, 2011 at 02:24:37PM +0200, Andrea Arcangeli wrote:
> On Tue, May 31, 2011 at 09:16:20PM +0900, Minchan Kim wrote:
> > I am not sure this is related to the problem you have seen.
> > If he used hwpoison via madvise, it is possible.
>
> CONFIG_ARCH_SUPPORTS_MEMORY_FAILURE=y
> # CONFIG_MEMORY_FAILURE is not set
>
> > Anyway, we can see negative value by count mismatch in UP build.
> > Let's fix it.
>
> Definitely let's fix it, but it's probably not related to this one.
>
> >
> > From 1d3ebce2e8aa79dcc912da16b7a8d0611b6f9f1a Mon Sep 17 00:00:00 2001
> > From: Minchan Kim <minchan.kim@xxxxxxxxx>
> > Date: Tue, 31 May 2011 21:11:58 +0900
> > Subject: [PATCH] Fix page isolated count mismatch
> >
> > If migration fails, we normally call putback_lru_pages, which
> > decrements NR_ISOLATED_[ANON|FILE].
> > That means we must increment NR_ISOLATED_[ANON|FILE] before calling
> > putback_lru_pages, but soft_offline_page doesn't do so.
> >
> > This can drive NR_ISOLATED_[ANON|FILE] negative. In a UP build,
> > zone_page_state then reports a huge number of isolated pages, so the
> > too_many_isolated functions are completely deceived. In the end, some
> > processes get stuck in D state because they expect the while loop with
> > congestion_wait to end, but it never does.
> >
> > If it is right, it would be -stable stuff.
> >
> > Cc: Mel Gorman <mel@xxxxxxxxx>
> > Cc: Andrea Arcangeli <aarcange@xxxxxxxxxx>
> > Signed-off-by: Minchan Kim <minchan.kim@xxxxxxxxx>
> > ---
> > mm/memory-failure.c | 4 +++-
> > 1 files changed, 3 insertions(+), 1 deletions(-)
> >
> > diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> > index 5c8f7e0..eac0ba5 100644
> > --- a/mm/memory-failure.c
> > +++ b/mm/memory-failure.c
> > @@ -52,6 +52,7 @@
> > #include <linux/swapops.h>
> > #include <linux/hugetlb.h>
> > #include <linux/memory_hotplug.h>
> > +#include <linux/mm_inline.h>
> > #include "internal.h"
> >
> > int sysctl_memory_failure_early_kill __read_mostly = 0;
> > @@ -1468,7 +1469,8 @@ int soft_offline_page(struct page *page, int flags)
> >  	put_page(page);
> >  	if (!ret) {
> >  		LIST_HEAD(pagelist);
> > -
> > +		inc_zone_page_state(page, NR_ISOLATED_ANON +
> > +					page_is_file_cache(page));
> >  		list_add(&page->lru, &pagelist);
> >  		ret = migrate_pages(&pagelist, new_page, MPOL_MF_MOVE_ALL,
> >  								0, true);
>
> Reviewed-by: Andrea Arcangeli <aarcange@xxxxxxxxxx>

Thanks, Andrea.
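
To make the mismatch in the changelog concrete, here is a minimal user-space
sketch of it. Everything in it (struct zone_model, state_up, state_smp,
putback_model, too_many_isolated_model) is a hypothetical stand-in, not kernel
code. It assumes that a UP read returns the signed counter as an unsigned long
with no clamping while an SMP read clamps negative values to zero, and it uses
a simplified "isolated > inactive" test in place of the real too_many_isolated
checks.

```c
/* Hypothetical user-space model of the isolated-page counters; not kernel code. */
enum isolated_item { ISOLATED_ANON, ISOLATED_FILE, NR_ISOLATED_ITEMS };

struct zone_model {
	long vm_stat[NR_ISOLATED_ITEMS];	/* signed counters */
	unsigned long nr_inactive;		/* stand-in for the inactive LRU size */
};

/* UP-style read (assumption): raw counter converted to unsigned long. */
static unsigned long state_up(const struct zone_model *z, enum isolated_item i)
{
	return (unsigned long)z->vm_stat[i];
}

/* SMP-style read (assumption): a negative counter is clamped to zero. */
static unsigned long state_smp(const struct zone_model *z, enum isolated_item i)
{
	long x = z->vm_stat[i];

	return x < 0 ? 0UL : (unsigned long)x;
}

/* Simplified throttle test; the real checks compare against LRU sizes. */
static int too_many_isolated_model(unsigned long isolated, unsigned long inactive)
{
	return isolated > inactive;
}

/* Putback path: decrements, whether or not an increment ever happened. */
static void putback_model(struct zone_model *z, enum isolated_item i)
{
	z->vm_stat[i]--;
}
```

Starting from a zero counter, one putback_model() call without a prior
increment leaves the counter at -1. Read the UP way, that is the largest
unsigned long, so the throttle test stays true no matter how long the caller
waits. Read the SMP way, it is clamped to 0 and the test stays false.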

>
> Let's check all other migrate_pages callers too...

I checked them before sending the patch, but I failed to find anything strange. :(
Now I am checking whether the page's SwapBacked flag can change between the
points before and after migrate_pages, which would make the NR_ISOLATED_XX
accounting go wrong. I am approaching the failure, too. Hmm.
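
A minimal user-space sketch of the SwapBacked scenario, in case it helps to
see why a flag change alone would be enough. The types and helpers here
(struct page_model, struct isolated_counters, account_isolate,
account_putback) are hypothetical, not kernel code. The only kernel behaviour
it assumes is that page_is_file_cache() is the negation of the SwapBacked
flag, so the counter is chosen from the flag's value at the moment of each
call.

```c
/* Hypothetical user-space model; not kernel code. */
struct page_model {
	int swap_backed;	/* stand-in for the SwapBacked page flag */
};

struct isolated_counters {
	long anon;		/* models NR_ISOLATED_ANON */
	long file;		/* models NR_ISOLATED_FILE */
};

/* Models page_is_file_cache(): a page is file cache if not swap backed. */
static int is_file_cache_model(const struct page_model *p)
{
	return !p->swap_backed;
}

/* Isolation: increment whichever counter the flag selects right now. */
static void account_isolate(struct isolated_counters *c, const struct page_model *p)
{
	if (is_file_cache_model(p))
		c->file++;
	else
		c->anon++;
}

/* Putback: decrement whichever counter the flag selects right now. */
static void account_putback(struct isolated_counters *c, const struct page_model *p)
{
	if (is_file_cache_model(p))
		c->file--;
	else
		c->anon--;
}
```

If the flag flips between account_isolate() and account_putback(), the
increment and the decrement land on different counters: one counter is left
at +1 forever and the other goes to -1, even though every isolate was paired
with a putback.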


--
Kind regards
Minchan Kim