Re: [PATCH 5/7] Add /proc trigger for memory compaction

From: Mel Gorman
Date: Thu Jan 21 2010 - 09:10:11 EST

On Wed, Jan 20, 2010 at 12:48:05PM -0800, David Rientjes wrote:
> On Wed, 20 Jan 2010, Mel Gorman wrote:
> > > With Lee's work on mempolicy-constrained hugepage allocations, there is a
> > > use-case for this explicit trigger to be exported via sysfs in the
> > > longterm:
> >
> > True, although the per-node structures are only available on NUMA making
> > it necessary to have two interfaces. The per-node one is handy enough
> > because it would be just
> >
> > /sys/devices/system/node/nodeX/compact_node
> > When written to, this node is compacted by the writing process
> >
> > But there does not appear to be a "good" way of having a non-NUMA
> > interface. /sys/devices/system/node does not exist .... Does anyone
> > remember why !NUMA does not have a /sys/devices/system/node/node0? Is
> > there a good reason or was there just no point?
> >
> There doesn't seem to be a usecase for a fake node0 sysfs entry since it
> would be a duplication of procfs.


> I think it would be best to create a global /proc/sys/vm/compact trigger
> that would walk all "compactable" zones system-wide

Easily done.

> and then a per-node
> /sys/devices/system/node/nodeX/compact trigger for that particular node,
> both with permissions 0200.

I will work on this as an additional patch. It should be straightforward;
the only care needed is around memory hotplug, as usual.

> It would be helpful to be able to determine what is "compactable" at the
> same time by adding both global and per-node "compact_order" tunables that
> would default to HUGETLB_PAGE_ORDER.

Well, rather than having a separate tunable, writing a number to
/proc/sys/vm/compact could indicate the order if that trigger is now
working system-wide. Would that be suitable?
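
To make the suggestion above concrete, here is a minimal user-space sketch of
how text written to such a trigger could be interpreted as an order. It is not
code from the patch series: parse_compact_order() and MAX_COMPACT_ORDER are
hypothetical names, and the upper bound of 10 assumes the common MAX_ORDER of
11.

```c
#include <errno.h>
#include <stdlib.h>

/* Assumed upper bound for illustration: orders 0..10, as with MAX_ORDER == 11. */
#define MAX_COMPACT_ORDER 10

/*
 * Interpret the text written to the trigger as a target allocation order.
 * Returns 0 and stores the order in *order on success, -EINVAL otherwise.
 * (Hypothetical helper, not part of the patch under discussion.)
 */
static int parse_compact_order(const char *buf, int *order)
{
	char *end;
	long val;

	errno = 0;
	val = strtol(buf, &end, 10);
	if (end == buf || errno == ERANGE)
		return -EINVAL;		/* no digits at all, or value out of range */
	if (*end == '\n')
		end++;			/* tolerate the trailing newline from echo */
	if (*end != '\0')
		return -EINVAL;		/* trailing garbage after the number */
	if (val < 0 || val > MAX_COMPACT_ORDER)
		return -EINVAL;

	*order = (int)val;
	return 0;
}
```

With this shape, `echo 9 > /proc/sys/vm/compact` would request compaction
aimed at order-9 blocks, and anything that is not a single in-range number
would be rejected.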

> Then, the corresponding "compact"
> trigger would only do work if fill_contig_page_info() shows
> !free_blocks_suitable for either all zones (global trigger) or each zone
> in the node's zonelist (per-node trigger).

Do you see a need for proc to act like this? I'm wondering if
try_to_compact_pages() already does what you're looking for, without
requiring a process to muck around in /proc or /sys.
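
For reference, the check described in the quoted proposal can be sketched in
user space as below. This assumes free_blocks_suitable counts free blocks of at
least the target order, expressed in units of the target order, which is my
reading of fill_contig_page_info() rather than a copy of it. The array layout,
NR_ORDERS and zone_needs_compaction() are illustrative names only.

```c
#include <stdbool.h>

/* Assumed for illustration: orders 0..10, as with MAX_ORDER == 11. */
#define NR_ORDERS 11

/*
 * Count the free blocks usable for a target_order allocation, given the
 * number of free blocks at each order (the per-zone counts that
 * /proc/buddyinfo reports). A free block of order o >= target_order
 * contains 2^(o - target_order) blocks of the target order.
 */
static unsigned long free_blocks_suitable(const unsigned long nr_free[NR_ORDERS],
					  unsigned int target_order)
{
	unsigned long suitable = 0;
	unsigned int order;

	for (order = target_order; order < NR_ORDERS; order++)
		suitable += nr_free[order] << (order - target_order);

	return suitable;
}

/* The trigger would only do work when no suitable free block exists. */
static bool zone_needs_compaction(const unsigned long nr_free[NR_ORDERS],
				  unsigned int target_order)
{
	return free_blocks_suitable(nr_free, target_order) == 0;
}
```

A zone with plenty of order-0 and order-1 pages but nothing larger would
report no suitable blocks for order 9, so the trigger would compact it; a zone
holding a single free order-10 block would be skipped.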

I mostly envisioned the /proc and /sys tunables being used either for
debugging or by a job scheduler that compacts one or more nodes before a new
job starts.
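
As a rough illustration of the job-scheduler case, the sketch below writes to a
per-node trigger for each node a job is about to use. It assumes the
/sys/devices/system/node/nodeX/compact path proposed in this thread, which is
only a proposal at this point, and that writing any value starts compaction.
The helper names are hypothetical.

```c
#include <stdio.h>

/* Proposed location from this thread; an assumption, not an existing interface. */
#define NODE_SYSFS_ROOT "/sys/devices/system/node"

/* Build "<root>/node<N>/compact" in buf. Returns 0, or -1 if buf is too small. */
static int node_trigger_path(char *buf, size_t len, const char *root, int node)
{
	int n = snprintf(buf, len, "%s/node%d/compact", root, node);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Write "1" to a trigger file. Returns 0 on success, -1 on any failure. */
static int write_trigger(const char *path)
{
	FILE *f = fopen(path, "w");
	int ok;

	if (!f)
		return -1;
	ok = fputs("1\n", f) >= 0;
	if (fclose(f) != 0)	/* the write may only fail when flushed */
		ok = 0;
	return ok ? 0 : -1;
}

/* Compact each listed node in turn; returns how many triggers failed. */
static int compact_nodes(const char *root, const int *nodes, size_t count)
{
	int failures = 0;
	size_t i;

	for (i = 0; i < count; i++) {
		char path[256];

		if (node_trigger_path(path, sizeof(path), root, nodes[i]) != 0 ||
		    write_trigger(path) != 0)
			failures++;
	}
	return failures;
}
```

A scheduler would call compact_nodes(NODE_SYSFS_ROOT, nodes, count) just
before launching the job. With the suggested 0200 permissions the caller needs
to be root, and each write would return only once that node has been compacted
by the writing process.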

Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab