Re: [RFC PATCH 6/6] mm, compaction: don't migrate in blocks that cannot be fully compacted in async direct compaction

From: David Rientjes
Date: Thu Jun 05 2014 - 17:38:40 EST


On Thu, 5 Jun 2014, Vlastimil Babka wrote:

> > Ok, so this obsoletes my patchseries that did something similar. I hope
>
> Your patches 1/3 and 2/3 would still make sense. Checking alloc flags is IMHO
> better than checking async here. That way, khugepaged and kswapd would still
> try to migrate stuff, which is important as Mel described in the reply to your
> 3/3.
>

Would you mind folding those two patches into your series since you'll be
requiring the gfp_mask in struct compact_control and your pageblock skip
is better than mine?

> > you can rebase this set on top of linux-next and then propose it formally
> > without the RFC tag.
>
> I posted this early to facilitate discussion, but if you want to test on
> linux-next then sure.
>

I'd love to test these.

> > We also need to discuss the scheduling heuristics, the reliance on
> > need_resched(), to abort async compaction. In testing, we actually
> > sometimes see 2-3 pageblocks scanned before terminating and thp has
> > very little chance of being allocated. At the same time, if we try to fault
> > 64MB of anon memory in and each of the 32 calls to compaction is
> > expensive but doesn't result in an order-9 page, we see very lengthy fault
> > latency.
>
> Yes, I thought you were about to try the 1GB per call setting. I don't
> currently have a test setup like yours. My patch 1/6 still relies on
> need_resched() but that could be replaced with a later patch.
>

Agreed. I was thinking higher than 1GB would be possible once we have
your series that does the pageblock skip for thp. I think the expense
would be constant because we won't needlessly be migrating pages unless
migration has a good chance of succeeding. I'm slightly concerned about
the COMPACT_CLUSTER_MAX termination, though, before we find unmigratable
memory, but I think that will be very low probability.

> > I think it would be interesting to consider doing async compaction
> > deferral up to 1 << COMPACT_MAX_DEFER_SHIFT after a sysctl-configurable
> > amount of memory is scanned, at least for thp, and remove the scheduling
> > heuristic entirely.
>
> That could work. How about the lock contention heuristic? Is it possible on a
> large and/or busy system to compact anything substantial without hitting the
> lock contention? Are your observations about too early abort based on
> need_resched() or lock contention?
>

Eek, it's mostly need_resched() because we don't use zone->lru_lock; we
have the memcg lruvec locks for lru locking. We end up dropping and
reacquiring different locks based on the memcg of the page being isolated
quite a bit.

This does raise the question about parallel direct compactors, though, that
will be contending on the same coarse zone->lru_lock locks and immediately
aborting and falling back to PAGE_SIZE pages for thp faults that will be
more likely if your patch to grab the high-order page and return it to the
page allocator is merged.