On 05/02/2018 02:33 PM, Andrew Morton wrote:
> On Tue, 1 May 2018 22:58:06 -0700 Prakash Sangappa <prakash.sangappa@xxxxxxxxxx> wrote:
>> For analysis purposes it is useful to have NUMA node information
>> corresponding to the mapped address ranges of a process. Currently,
>> /proc/<pid>/numa_maps provides a list of the NUMA nodes from which pages
>> are allocated, per VMA of the process. This is not useful if a user needs
>> to determine which NUMA node the mapped pages are allocated from for a
>> particular address range. It would help if the NUMA node information
>> presented in /proc/<pid>/numa_maps were broken down by VA ranges, showing
>> the exact NUMA node from which the pages have been allocated.
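FWIW, a per-page answer for an arbitrary range can already be had from
userspace with move_pages(2) when the nodes array is NULL; it moves
nothing and just fills status[] with each page's current node. A rough
sketch of that existing route (the range size is made up, and it links
against libnuma), not anything from the patch itself:

/* Sketch only: ask the kernel which node each page of a range is on.
 * move_pages(2) with nodes == NULL moves nothing; it fills status[]
 * with the current node of each page (or a negative errno).
 * The range below is a made-up example, not the real use case.
 */
#include <numaif.h>		/* move_pages(), link with -lnuma */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
	long psz = sysconf(_SC_PAGESIZE);
	unsigned long npages = 8;		/* hypothetical range length */
	char *buf = malloc(npages * psz);	/* stand-in for the mapping of interest */
	void **pages = calloc(npages, sizeof(*pages));
	int *status = calloc(npages, sizeof(*status));
	unsigned long i;

	for (i = 0; i < npages; i++) {
		buf[i * psz] = 1;		/* touch so the page is actually allocated */
		pages[i] = buf + i * psz;
	}

	if (move_pages(0 /* self */, npages, pages, NULL, status, 0))
		perror("move_pages");
	else
		for (i = 0; i < npages; i++)
			printf("%p -> node %d\n", pages[i], status[i]);

	return 0;
}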
I'm finding myself a little lost in figuring out what this does. Today,
numa_maps might tell us that a 3-page VMA has 1 page from Node 0 and 2 pages
from Node 1. We group *entirely* by VMA:
1000-4000 N0=1 N1=2
We don't want that. We want to tell exactly where each node's memory is,
even if pages from different nodes are in the same VMA, like this:
1000-2000 N1=1
2000-3000 N0=1
3000-4000 N1=1
So no line of output ever has more than one node's memory. It
*appears* in this new file as if each contiguous range of memory from a
given node has its own VMA. Right?
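If that's the semantic, the grouping itself is just run-length coalescing
of per-page node data; a tiny sketch fed with the hypothetical 3-page
example above reproduces exactly those three lines:

/* Sketch: collapse a per-page node[] array into one line per contiguous
 * same-node run, mimicking the proposed output. The start address, page
 * size and node array are hypothetical.
 */
#include <stdio.h>

static void print_node_ranges(unsigned long start, unsigned long psz,
			      const int *node, unsigned long npages)
{
	unsigned long run = 0, i;

	for (i = 1; i <= npages; i++) {
		if (i == npages || node[i] != node[run]) {
			printf("%lx-%lx N%d=%lu\n",
			       start + run * psz, start + i * psz,
			       node[run], i - run);
			run = i;
		}
	}
}

int main(void)
{
	/* the 3-page example above: pages on nodes 1, 0, 1 */
	int node[] = { 1, 0, 1 };

	print_node_ranges(0x1000, 0x1000, node, 3);	/* prints the three lines shown */
	return 0;
}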
This sounds interesting, but I've never found myself wanting this
information a single time that I can recall. I'd love to hear more.
Is this for debugging? Are apps actually going to *parse* this file?
How hard did you try to share code with numa_maps? Are you sure we
can't just replace numa_maps? VMAs are a kernel-internal thing and we
never promised to represent them 1:1 in our ABI.
Are we going to continue creating new files in /proc every time a tiny
new niche pops up? :)