Re: [patch v4] mm, thp: only collapse hugepages to nodes with affinity for zone_reclaim_mode

From: Mel Gorman
Date: Fri Jul 25 2014 - 11:34:24 EST

On Thu, Jul 17, 2014 at 02:48:07PM -0700, David Rientjes wrote:
> Commit 9f1b868a13ac ("mm: thp: khugepaged: add policy for finding target
> node") improved the previous khugepaged logic which allocated a
> transparent hugepage from the node of the first page being collapsed.
> However, it is still possible to collapse pages to remote memory which may
> suffer from additional access latency. With the current policy, it is
> possible that 255 pages (with PAGE_SHIFT == 12) will be collapsed remotely
> if the majority are allocated from that node.
>
> When zone_reclaim_mode is enabled, it means the VM should make every attempt
> to allocate locally to prevent NUMA performance degradation. In this case,
> we do not want to collapse hugepages to remote nodes that would suffer from
> increased access latency. Thus, when zone_reclaim_mode is enabled, only
> allow collapsing to nodes with RECLAIM_DISTANCE or less.
>
> There is no functional change for systems that disable zone_reclaim_mode.
>
> Signed-off-by: David Rientjes <rientjes@xxxxxxxxxx>
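
For reference, the policy described in the changelog can be modelled in
userspace roughly as follows. This is only a sketch and not the code from the
patch: the function name scan_should_abort(), the NR_NODES constant, the
distance table and the RECLAIM_DISTANCE value of 30 are all assumptions made
for illustration. It tracks how many 4K pages have been seen on each node and
reports whether a page on a new node is farther than RECLAIM_DISTANCE from any
node already seen.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_NODES 4           /* hypothetical node count for this model */
#define RECLAIM_DISTANCE 30  /* assumed threshold; the kernel value is arch-dependent */

/* Hypothetical SLIT-style distance table: 10 is local, larger is farther. */
static const int node_dist[NR_NODES][NR_NODES] = {
	{10, 20, 40, 40},
	{20, 10, 40, 40},
	{40, 40, 10, 20},
	{40, 40, 20, 10},
};

/*
 * node_load[i] is the number of 4K pages already seen on node i within
 * the hugepage-sized range being scanned.
 *
 * Returns true if a page on node nid is more than RECLAIM_DISTANCE away
 * from some node that already holds pages in the range, meaning the
 * collapse should be skipped. Always returns false when the
 * zone_reclaim_mode-like switch is off, so behaviour is unchanged there.
 */
static bool scan_should_abort(const int node_load[NR_NODES], int nid,
			      bool zone_reclaim_enabled)
{
	int i;

	assert(nid >= 0 && nid < NR_NODES);

	if (!zone_reclaim_enabled)
		return false;

	/* This node was already checked when its first page was seen. */
	if (node_load[nid])
		return false;

	for (i = 0; i < NR_NODES; i++) {
		if (!node_load[i])
			continue;
		if (node_dist[nid][i] > RECLAIM_DISTANCE)
			return true;
	}
	return false;
}
```

With pages already seen on node 0, a further page on node 1 (distance 20) is
accepted in this model while a page on node 2 (distance 40) aborts the scan.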

The patch looks ok for what it is intended to do, so

Acked-by: Mel Gorman <mgorman@xxxxxxx>

However, I would consider it likely that pages allocated on different nodes
within a hugepage boundary indicate that multiple threads on different nodes
are accessing those pages. I would be skeptical that reduced TLB misses
offset remote access penalties. Should we simply refuse to collapse huge
pages when the 4K pages are allocated from different nodes? If automatic
NUMA balancing is enabled and the accesses are really coming from one node
then the 4K pages will eventually be migrated to a local node and then
khugepaged can collapse them.
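
As a rough userspace illustration of that stricter rule (a sketch only, not
proposed kernel code), the check amounts to requiring that every 4K page in
the range reports the same node. The function name all_pages_on_one_node()
and the page_nid[] input array are hypothetical; in the kernel the node would
come from each page as it is scanned.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * page_nid[i] is the NUMA node of the i-th 4K page in the candidate
 * hugepage range (hypothetical input for this model).
 *
 * Returns true only if every page sits on the same node as the first
 * one, i.e. the range would be eligible for collapse under the
 * "refuse mixed-node ranges" rule. An empty range is trivially eligible.
 */
static bool all_pages_on_one_node(const int *page_nid, size_t nr_pages)
{
	size_t i;

	assert(page_nid != NULL || nr_pages == 0);

	for (i = 1; i < nr_pages; i++) {
		if (page_nid[i] != page_nid[0])
			return false;
	}
	return true;
}
```

Under this rule a mixed range is simply skipped and retried on a later scan,
by which time NUMA balancing may have migrated the stragglers.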

Mel Gorman