Re: NUMA mempolicy /proc code in mainline shouldn't have been merged

From: Christoph Lameter
Date: Mon Sep 19 2005 - 12:11:53 EST


On Sun, 11 Sep 2005, Andi Kleen wrote:

> On Sunday 11 September 2005 08:51, Andrew Morton wrote:
>
> > Certainly I can see value in that. How can a developer test his
> > code without any form of runtime feedback?
>
> There are already several ways to do that: first the counters output
> by numastat (local_node, other_node, interleave_hit etc.), which tells you
> exactly how the allocation strategy ended up. And a process can find out
> on which node a specific page is using get_mempolicy()

get_mempolicy() only reports on the calling thread, so it has to be invoked
from inside the process being examined. This means all applications would
need some specialized library to dump the memory allocation information,
and you would need some external means of triggering the dump of node
information.

The other method, via numastat, gives aggregate values that are not useful
at all if multiple processes are running. Yes, on a bare machine with one
process running one could determine the memory allocated by searching
through all nodes in order to guess where the memory was allocated (try
that on a box with hundreds of nodes!).
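
To illustrate why these counters are aggregate: a minimal sketch that reads
the per-node numastat files, assuming the sysfs layout
/sys/devices/system/node/node<N>/numastat with one "name value" pair per
line. The function names parse_numastat and read_all_nodes are hypothetical
and chosen only for this example. Note that nothing in the output identifies
a process.

```python
import glob
import os
import re
from typing import Dict


def parse_numastat(text: str) -> Dict[str, int]:
    """Parse one node's numastat text ("name value" per line) into a dict."""
    counters: Dict[str, int] = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip anything that is not exactly "<name> <non-negative integer>".
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def read_all_nodes(root: str = "/sys/devices/system/node") -> Dict[int, Dict[str, int]]:
    """Return {node_id: counters} for every node directory found under root.

    The default root is an assumption about where sysfs is mounted.
    """
    result: Dict[int, Dict[str, int]] = {}
    for path in glob.glob(os.path.join(root, "node[0-9]*", "numastat")):
        node_dir = os.path.basename(os.path.dirname(path))
        match = re.fullmatch(r"node(\d+)", node_dir)
        if match is None:
            continue
        with open(path) as f:
            result[int(match.group(1))] = parse_numastat(f.read())
    return result
```

Calling read_all_nodes() returns one dictionary per node with keys such as
local_node, other_node and interleave_hit. These are system-wide totals per
node, so with several processes running they cannot be attributed to any one
of them.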

However, one still does not know which memory section (vma) is allocated
on which nodes. And this may be important since critical data may need to
be interleaved whereas other massive amounts of data may be organized in a
different way.
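
As a sketch of what per-vma information looks like to a consumer, the
following parses one line of per-process NUMA map output. It assumes the
line format of /proc/<pid>/numa_maps in later mainline kernels (start
address in hex, then the policy, then key=value tokens including
N<node>=<pages>); the format produced by the code under discussion may
differ. The helper name parse_numa_maps_line is hypothetical.

```python
import re
from typing import Dict, Tuple

# Matches tokens such as "N0=12": pages of this vma resident on node 0.
_NODE_TOKEN = re.compile(r"N(\d+)=(\d+)")


def parse_numa_maps_line(line: str) -> Tuple[int, str, Dict[int, int]]:
    """Return (vma_start_address, policy, {node_id: pages}) for one line."""
    fields = line.split()
    if len(fields) < 2:
        raise ValueError("expected at least an address and a policy")
    start = int(fields[0], 16)  # the address column is hex without "0x"
    policy = fields[1]
    pages_per_node: Dict[int, int] = {}
    for token in fields[2:]:
        match = _NODE_TOKEN.fullmatch(token)
        if match:
            pages_per_node[int(match.group(1))] = int(match.group(2))
    return start, policy, pages_per_node
```

Applied to every line of one process's map, this shows which vma is
interleaved and which sits on a single node. That is the information that
neither the aggregate counters nor an in-process get_mempolicy() call hands
to an outside observer.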

> If you really want to know what's going on you can use performance counters
> of the machine to tell you the amount of cross node traffic
> (e.g. see numamon in the numactl source tree as an example)

Cross node traffic is not the same as memory use. Performance counters are
a debugging feature; they are complex to manage and only available in a
limited way. It's also not trivial to figure out a way to trigger these
counters in order to get the data.

> All it does is to open the flood gates of external mempolicy management, which
> is wrong.

It does not open those doors since the modification of memory policies is
still not possible (IMHO this is one element of a series of problematic
"features" in the policy layer).

External memory policy management is a necessary feature for system
administration, batch process scheduling as well as for testing and
debugging a system.