Re: [RFC PATCH] Add /proc/<pid>/numa_vamaps for numa node information

From: Michal Hocko
Date: Fri Sep 14 2018 - 02:21:40 EST


On Fri 14-09-18 03:33:28, Jann Horn wrote:
> On Wed, Sep 12, 2018 at 10:43 PM prakash.sangappa
> <prakash.sangappa@xxxxxxxxxx> wrote:
> > On 05/09/2018 04:31 PM, Dave Hansen wrote:
> > > On 05/07/2018 06:16 PM, prakash.sangappa wrote:
> > >> It will be /proc/<pid>/numa_vamaps. Yes, the behavior will be
> > >> different with respect to seeking. Output will still be text and
> > >> the format will be the same.
> > >>
> > >> I want to get feedback on this approach.
> > > I think it would be really great if you can write down a list of the
> > > things you actually want to accomplish. Dare I say: you need a
> > > requirements list.
> > >
> > > The numa_vamaps approach continues down the path of an ever-growing list
> > > of highly-specialized /proc/<pid> files. I don't think that is
> > > sustainable, even if it has been our trajectory for many years.
> > >
> > > Pagemap wasn't exactly a shining example of us getting new ABIs right,
> > > but it sounds like something along those lines is what we need.
> >
> > Just sent out a V2 patch. This patch simplifies the file content. It
> > only provides VA range to numa node id information.
> >
> > The requirement is basically observability for performance analysis.
> >
> > - Need to be able to determine VA range to numa node id information.
> > Which also gives an idea of which range has memory allocated.
> >
> > - The proc file /proc/<pid>/numa_vamaps is in text so it is easy to
> > directly view.
> >
> > The V2 patch supports seeking to a particular process VA from where
> > the application could read the VA to numa node id information.
> >
> > Also added the 'PTRACE_MODE_READ_REALCREDS' check when opening the
> > /proc file, as was indicated by Michal Hocko.
>
> procfs files should use PTRACE_MODE_*_FSCREDS, not PTRACE_MODE_*_REALCREDS.

Out of curiosity, what is the semantic difference? At least
kernel_move_pages uses PTRACE_MODE_READ_REALCREDS. Is this a bug?
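
For context on the question: as far as I read ptrace(2), the _FSCREDS
variants check the caller's filesystem UID/GID and effective capabilities,
while the _REALCREDS variants check the real UID/GID and permitted
capabilities. The two sets can differ, e.g. in a setuid program. Below is
a minimal user-space sketch (Python, not kernel code) that shows both sets
for a process by parsing the "Uid:" line of /proc/<pid>/status. The helper
name parse_uids is made up for illustration.

```python
def parse_uids(status_text):
    """Return the four UIDs from the 'Uid:' line of /proc/<pid>/status.

    Per proc(5) the fields are: real, effective, saved set, filesystem.
    parse_uids is a hypothetical helper, not part of any kernel interface.
    """
    for line in status_text.splitlines():
        if line.startswith("Uid:"):
            real, effective, saved, fs = (int(f) for f in line.split()[1:5])
            return {"real": real, "effective": effective,
                    "saved": saved, "fs": fs}
    raise ValueError("no 'Uid:' line found")


if __name__ == "__main__":
    # Show the credentials of the current process (Linux only).
    with open("/proc/self/status") as f:
        uids = parse_uids(f.read())
    # A _REALCREDS check would look at uids["real"], a _FSCREDS check at
    # uids["fs"]; for an ordinary unprivileged process they are equal.
    print(uids)
```

When "real" and "fs" are the same, which is the common case, both kinds
of check see the same UID. The distinction only matters for a caller
whose real and filesystem credentials have diverged.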

--
Michal Hocko
SUSE Labs