Re: [PATCH] mm: Do not stall in synchronous compaction for THP allocations

From: Mel Gorman
Date: Wed Nov 16 2011 - 10:07:54 EST


On Wed, Nov 16, 2011 at 02:30:56PM +0100, Andrea Arcangeli wrote:
> On Wed, Nov 16, 2011 at 05:13:50AM +0100, Andrea Arcangeli wrote:
> > After checking my current thp vmstat I think Andrew was right and we
> > backed out for a good reason before. I'm getting significantly worse
> > success rate, not sure why it was a small reduction in success rate
> > but hey I cannot exclude I may have broken something with some other
> > patch. I've been running it together with a couple more changes. If
> > it's this change that reduced the success rate, I'm afraid going
> > always async is not ok.
>
> I wonder if the high failure rate when shutting off "sync compaction"
> and forcing only "async compaction" for THP (your patch queued in -mm)
> is also because of ISOLATE_CLEAN being set in compaction from commit
> 39deaf8. ISOLATE_CLEAN skipping PageDirty means all tmpfs/anon pages
> added to swapcache (or removed from swapcache which sets the dirty bit
> on the page because the pte may be mapped clean) are skipped entirely
> by async compaction for no good reason.

Good point! Even though these pages can be migrated without IO or
incurring a sync, we are skipping over them. I'm still looking at
passing sync down to ->migratepage and, as part of that, ISOLATE_CLEAN
will need new smarts.

> That can't possibly be ok,
> because those don't actually require any I/O or blocking to be
> migrated. PageDirty is a "blocking/IO" operation only for filebacked
> pages. So I think we must revert 39deaf8, instead of cleaning it up
> with my cleanup posted in Message-Id 20111115020831.GF4414@xxxxxxxxxx .
>

It would be preferable if the pages that would block during migration
could be identified in advance, but that may be unrealistic. A better
compromise may be to only isolate PageDirty pages whose mapping has a
->migratepage callback.
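
The compromise above (isolating a dirty page only when a ->migratepage
callback is available) could be modelled roughly as follows. This is a
user-space sketch, not kernel code: struct model_page, struct
model_mapping, model_migratepage() and async_may_isolate() are
hypothetical names invented for illustration, and treating a page with
no mapping as safe to isolate is an assumption of the sketch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these are the kernel's real types. */
struct model_mapping {
	/* Non-NULL when the backing store supplies its own migration callback. */
	int (*migratepage)(void);
};

struct model_page {
	bool dirty;                          /* stands in for PageDirty() */
	const struct model_mapping *mapping; /* NULL: page has no mapping */
};

/* Placeholder callback so that a mapping can advertise ->migratepage. */
int model_migratepage(void)
{
	return 0;
}

/*
 * Decide whether async compaction may isolate a page.
 * Clean pages always qualify. A dirty page qualifies only if migrating
 * it is not expected to need writeback: either it has no mapping
 * (assumption of this sketch) or its mapping provides ->migratepage.
 */
bool async_may_isolate(const struct model_page *page)
{
	if (!page->dirty)
		return true;
	if (page->mapping == NULL)
		return true;
	return page->mapping->migratepage != NULL;
}

/* Example mappings and pages used to exercise the check. */
const struct model_mapping with_callback = { model_migratepage };
const struct model_mapping without_callback = { NULL };

const struct model_page dirty_with_callback = { true, &with_callback };
const struct model_page dirty_no_callback = { true, &without_callback };
const struct model_page clean_no_callback = { false, &without_callback };
```

In this model a dirty tmpfs-like or swapcache-like page (one whose
mapping has the callback) is isolated, while a dirty page whose mapping
lacks the callback is skipped, since migrating it could mean writing it
out first.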

> ISOLATE_CLEAN still looks right for may_writepage: for reclaim, the dirty
> bit set on the page is an I/O event; for migrate it's not if it's
> tmpfs/anon.
>
> Did you run your compaction tests with some swap activity?
>

Some, but not intensive.

> Reducing the async compaction effectiveness while there's some swap
> activity then also leads to running sync compaction and page reclaim
> more frequently than needed.
>
> I'm hopeful however that by running just 2 passes of migrate_pages
> main loop with the "avoid overwork in migrate sync mode" patch, we can
> fix the excessive hanging. If that works, the number of passes could
> actually be a tunable, and setting it to 1 (instead of 2) would then
> provide 100% "async compaction" behavior again. And if somebody
> prefers to stick to 10 he can... so then he can do trylock pass 0,
> lock_page pass1, wait_writeback pass2, wait pin pass3, finally migrate
> pass4. (something 2 passes alone won't allow). So making the migrate
> passes/force-threshold tunable (maybe only for the new sync=2
> migration mode) could be a good idea. Or we could just return to sync
> true/false and have the migration tunable affect everything but that
> would alter the reliability of sys_move_pages and other numa things
> too, where I guess 10 passes are ok. This is why I added a sync=2 mode
> for migrate.

I am vaguely concerned that this will just make the stalling harder to
reproduce and diagnose. While you investigate this route, I'm going to
keep investigating using only async migration for THP and having async
compaction move pages it can migrate without blocking.
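
For reference, the pass sequence in the quoted proposal (trylock on
pass 0, lock_page on pass 1, wait_writeback on pass 2, wait pin on pass
3, migrate on pass 4) can be written down as a small table lookup. This
is only a sketch of how a tunable pass count would bound the blocking
steps; enum migrate_step, deepest_step() and passes_may_block() are
hypothetical names, and the one-step-per-pass mapping is taken from the
quoted text rather than from any existing kernel code.

```c
#include <stdbool.h>

/* Steps in the order the quoted proposal lists them, one per pass. */
enum migrate_step {
	STEP_TRYLOCK,        /* pass 0: never blocks */
	STEP_LOCK_PAGE,      /* pass 1: may sleep on the page lock */
	STEP_WAIT_WRITEBACK, /* pass 2: may wait for writeback */
	STEP_WAIT_PIN,       /* pass 3: may wait for a pin to go away */
	STEP_MIGRATE         /* pass 4 and later: final migration attempt */
};

/*
 * Return the most aggressive step reachable with the given pass limit.
 * Limits below 1 are treated as 1; limits above 5 add nothing new.
 */
enum migrate_step deepest_step(int max_passes)
{
	int last_pass;

	if (max_passes < 1)
		max_passes = 1;
	last_pass = max_passes - 1;
	if (last_pass > STEP_MIGRATE)
		last_pass = STEP_MIGRATE;
	return (enum migrate_step)last_pass;
}

/* True if any pass within the limit is allowed to block. */
bool passes_may_block(int max_passes)
{
	return deepest_step(max_passes) > STEP_TRYLOCK;
}
```

With a limit of 1 the model never gets past trylock, which corresponds
to the fully async behaviour; with a limit of 2 it can sleep on the page
lock but never reaches wait_writeback.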

--
Mel Gorman
SUSE Labs