Re: [RFC PATCH] Add /proc/<pid>/numa_vamaps for numa node information

From: Anshuman Khandual
Date: Thu May 03 2018 - 04:46:21 EST


On 05/03/2018 03:58 AM, Dave Hansen wrote:
> On 05/02/2018 02:33 PM, Andrew Morton wrote:
>> On Tue, 1 May 2018 22:58:06 -0700 Prakash Sangappa <prakash.sangappa@xxxxxxxxxx> wrote:
>>> For analysis purposes it is useful to have numa node information
>>> corresponding to mapped address ranges of the process. Currently
>>> /proc/<pid>/numa_maps provides list of numa nodes from where pages are
>>> allocated per VMA of the process. This is not useful if a user needs to
>>> determine which numa node the mapped pages are allocated from for a
>>> particular address range. It would have helped if the numa node information
>>> presented in /proc/<pid>/numa_maps was broken down by VA ranges showing the
>>> exact numa node from where the pages have been allocated.
>
> I'm finding myself a little lost in figuring out what this does. Today,
> numa_maps might tell us that a 3-page VMA has 1 page from Node 0 and 2 pages
> from Node 1. We group *entirely* by VMA:
>
> 1000-4000 N0=1 N1=2
>
> We don't want that. We want to tell exactly where each node's memory is
> even if they are in the same VMA, like this:
>
> 1000-2000 N1=1
> 2000-3000 N0=1
> 3000-4000 N1=1
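
To make the per-range format above concrete, here is a minimal Python
sketch that merges consecutive pages residing on the same node into one
output line. It is only an illustration, not the patch's kernel code:
the function names and the list-of-node-ids input are hypothetical, and
it assumes every page in the VMA is populated.

```python
def coalesce_by_node(start, page_size, page_nodes):
    """Merge consecutive same-node pages into (start, end, node, pages) ranges.

    page_nodes is a hypothetical per-page list of node ids for one VMA,
    in address order.
    """
    ranges = []
    for i, node in enumerate(page_nodes):
        addr = start + i * page_size
        if ranges and ranges[-1][2] == node:
            # Same node as the previous page: extend the current range.
            s, _, n, count = ranges[-1]
            ranges[-1] = (s, addr + page_size, n, count + 1)
        else:
            # Node changed (or first page): start a new range.
            ranges.append((addr, addr + page_size, node, 1))
    return ranges


def format_ranges(ranges):
    """Render ranges in the 'start-end N<node>=<pages>' style used above."""
    return ["%x-%x N%d=%d" % r for r in ranges]
```

Feeding it the 3-page VMA from the example, with pages on nodes 1, 0, 1
starting at 0x1000 and a page size of 0x1000, gives the three lines
quoted above. If the first two pages shared a node, they would collapse
into a single line.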

I am wondering how many lines of output we might get on a big-memory
system for a large VMA (consuming, let's say, 80% of system RAM) under
an interleave policy. Wouldn't that be a problem?
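
A rough back-of-the-envelope sketch of the worst case, in Python. The
assumptions are mine and not from the patch: page-granular interleave
across at least two nodes, so no two adjacent pages share a node and
every page gets its own line; a fully populated VMA; 4 KiB base pages;
and a hypothetical 1 TiB system. Huge pages or partially populated
ranges would lower the count.

```python
def worst_case_lines(ram_bytes, vma_pct, page_size=4096):
    """Upper bound on output lines when every page needs its own line.

    Assumes page-granular interleave over >= 2 nodes and a fully
    populated VMA covering vma_pct percent of RAM (illustrative only).
    """
    vma_bytes = ram_bytes * vma_pct // 100
    return vma_bytes // page_size


TIB = 1 << 40
lines = worst_case_lines(TIB, 80)  # hypothetical 1 TiB system, 80% VMA
```

Under those assumptions the result is on the order of 200 million lines
for a single VMA.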