Re: [PATCH v3 3/3] Documentation/filesystems/proc.txt: document hugetlb RSS

From: Naoya Horiguchi
Date: Tue Aug 11 2015 - 20:04:56 EST

On Mon, Aug 10, 2015 at 05:44:54PM -0700, David Rientjes wrote:
> On Mon, 10 Aug 2015, Naoya Horiguchi wrote:
> > diff --git v4.2-rc4.orig/Documentation/filesystems/proc.txt v4.2-rc4/Documentation/filesystems/proc.txt
> > index 6f7fafde0884..cb8565e150ed 100644
> > --- v4.2-rc4.orig/Documentation/filesystems/proc.txt
> > +++ v4.2-rc4/Documentation/filesystems/proc.txt
> > @@ -168,6 +168,7 @@ For example, to get the status information of a process, all you have to do is
> > VmLck: 0 kB
> > VmHWM: 476 kB
> > VmRSS: 476 kB
> > + VmHugetlbRSS: 0 kB
> > VmData: 156 kB
> > VmStk: 88 kB
> > VmExe: 68 kB
> > @@ -230,6 +231,7 @@ Table 1-2: Contents of the status files (as of 4.1)
> > VmLck locked memory size
> > VmHWM peak resident set size ("high water mark")
> > VmRSS size of memory portions
> > + VmHugetlbRSS size of hugetlb memory portions
> > VmData size of data, stack, and text segments
> > VmStk size of data, stack, and text segments
> > VmExe size of text segment
> > @@ -440,8 +442,12 @@ indicates the amount of memory currently marked as referenced or accessed.
> > "Anonymous" shows the amount of memory that does not belong to any file. Even
> > a mapping associated with a file may contain anonymous pages: when MAP_PRIVATE
> > and a page is modified, the file page is replaced by a private anonymous copy.
> > -"Swap" shows how much would-be-anonymous memory is also used, but out on
> > -swap.
> > +"Swap" shows how much would-be-anonymous memory is also used, but out on swap.
> > +Since 4.3, "RSS" contains the amount of mappings for hugetlb pages. Although
> > +RSS of hugetlb mappings is maintained separately from normal mappings
> > +(displayed in "VmHugetlbRSS" field of /proc/PID/status,) /proc/PID/smaps shows
> > +both mappings in "RSS" field. Userspace applications clearly distinguish the
> > +type of mapping with 'ht' flag in "VmFlags" field.
> >
> > "VmFlags" field deserves a separate description. This member represents the kernel
> > flags associated with the particular virtual memory area in two letter encoded
> My objection to adding hugetlb memory to the RSS field of /proc/pid/smaps
> still stands and can be addressed in the thread of the first patch. Since
> this includes wording that describes that change, then the objection would
> also cover that.

OK, I'll update this in accordance with the change on the first patch.

> With regard to adding VmHugetlbRSS, I think the change is fine, and I
> appreciate that you call it VmHugetlbRSS and not VmHugeRSS since that
> would be confused with thp.

I plan to rename the field, and the new name should still be unambiguous
between thp and hugetlb.

> My only concern regarding VmHugetlbRSS would be extendability and whether
> we will eventually, or even today, want to differentiate between various
> hugetlb page sizes. For example, if 1GB hugetlb pages on x86 are a
> precious resource, then how do I determine which process has mapped it
> rather than 512 2MB hugetlb pages?

The "KernelPageSize" field in /proc/PID/smaps is aware of hugetlb page sizes,
so I expected userspace to detect the size itself. But /proc/PID/status shows
only process-wide info, so userspace applications must read both of these
files to know the usage per hugepage size, which might be inconvenient.
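
To illustrate what userspace would have to do with smaps today, here is a
rough sketch. It assumes that "Rss" in /proc/PID/smaps counts hugetlb pages
as proposed in this series, and that hugetlb VMAs carry the 'ht' flag in
"VmFlags". The function name is made up for illustration.

```python
from collections import defaultdict


def hugetlb_rss_by_page_size(smaps_text):
    """Sum Rss (kB) of hugetlb VMAs, keyed by KernelPageSize (kB).

    Assumption: "VmFlags" is the last line of each VMA entry, so it is
    used as the point where one VMA's values are committed and reset.
    """
    totals = defaultdict(int)
    rss = page_size = None
    for line in smaps_text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if key == "Rss":
            rss = int(fields[0])
        elif key == "KernelPageSize":
            page_size = int(fields[0])
        elif key == "VmFlags":
            # Count only VMAs flagged as hugetlb ('ht').
            if "ht" in fields and rss is not None and page_size is not None:
                totals[page_size] += rss
            rss = page_size = None
    return dict(totals)
```

A caller would pass the contents of /proc/PID/smaps as a string and get back
a mapping from hugepage size to resident kB.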

One idea is to show the new field like "VmHugetlbRSS: 2x2048kB 1x1GB" in
both of /proc/PID/{status,smaps}, which passes the full hugetlb info in a
single line and so is easier to parse and process. Or, since other fields
are shown in "kB", "VmHugetlbRSS: 1052672 kB (2x2048kB 1x1GB)" is possible
for human readability.
Thank you very much for the feedback, I'll repost soon, but any additional
comment is appreciated.