Re: [PATCH v7] mm: Add PM_THP_MAPPED to /proc/pid/pagemap
From: Matthew Wilcox
Date: Tue Nov 23 2021 - 16:30:49 EST

On Tue, Nov 23, 2021 at 01:10:37PM -0800, Mina Almasry wrote:
> On Tue, Nov 23, 2021 at 12:51 PM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
> >
> > On Mon, Nov 22, 2021 at 04:01:02PM -0800, Mina Almasry wrote:
> > > Add PM_THP_MAPPED MAPPING to allow userspace to detect whether a given virt
> > > address is currently mapped by a transparent huge page or not. Example
> > > use case is a process requesting THPs from the kernel (via a huge tmpfs
> > > mount for example), for a performance critical region of memory. The
> > > userspace may want to query whether the kernel is actually backing this
> > > memory by hugepages or not.
> >
> > So you want this bit to be clear if the memory is backed by a hugetlb
> > page?
> >
>
> Yes I believe so. I do not see value in telling the userspace that the
> virt address is backed by a hugetlb page, since if the memory is
> mapped by MAP_HUGETLB or is backed by a hugetlb file then the memory
> is backed by hugetlb pages and there is no vagueness from the kernel
> here.
>
> Additionally hugetlb interfaces are more size based rather than PMD or
> not. arm64 for example supports 64K, 2MB, 32MB and 1G 'huge' pages and
> it's an implementation detail that those sizes are mapped CONTIG PTE,
> PMD, CONTIG PMD, and PUD respectively, and the specific mapping
> mechanism is typically not exposed to the userspace and might not be
> stable. Assuming pagemap_hugetlb_range() == PMD_MAPPED would not
> technically be correct.

What I've been trying to communicate over the N reviews of this
patch series is that *the same thing is about to happen to THPs*.
Only more so. THPs are going to be of arbitrary power-of-two size, not
necessarily sizes supported by the hardware. That means that we need to
be extremely precise about what we mean by "is this a THP?" Do we just
mean "This is a compound page?" Do we mean "this is mapped by a PMD?"
Or do we mean something else? And I feel like I haven't been able to
get that information out of you.
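
For context on the userspace side of this question, here is a minimal
sketch of how a process could read its own pagemap entry and test a
single flag bit. It assumes only the documented layout (one 64-bit entry
per virtual page, bit 63 = present, bit 62 = swapped). The helper names
are hypothetical, and the bit number for the proposed flag is left as a
caller-supplied parameter because it is defined by the patch, not here.
Whatever the bit ends up meaning (compound page, PMD-mapped, or something
else) is exactly what the code cannot tell you.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Documented pagemap bits (Documentation/admin-guide/mm/pagemap.rst). */
#define PM_PRESENT (1ULL << 63)
#define PM_SWAP    (1ULL << 62)

/* Byte offset of the 64-bit pagemap entry that describes vaddr. */
uint64_t pagemap_offset(uintptr_t vaddr, uint64_t page_size)
{
	assert(page_size != 0);
	return (vaddr / page_size) * sizeof(uint64_t);
}

/*
 * Test one bit of a pagemap entry. 'bit' is supplied by the caller;
 * the position of any proposed flag is an assumption, not an ABI.
 */
bool pagemap_bit_set(uint64_t entry, unsigned int bit)
{
	assert(bit < 64);
	return (entry >> bit) & 1ULL;
}

/*
 * Read the pagemap entry for vaddr in the calling process.
 * Returns 0 on success, -1 if pagemap is unavailable or the read is short.
 */
int pagemap_read_entry(const void *vaddr, uint64_t *entry)
{
	long page_size = sysconf(_SC_PAGESIZE);
	ssize_t n;
	int fd;

	if (page_size <= 0)
		return -1;

	fd = open("/proc/self/pagemap", O_RDONLY);
	if (fd < 0)
		return -1;

	n = pread(fd, entry, sizeof(*entry),
		  (off_t)pagemap_offset((uintptr_t)vaddr, (uint64_t)page_size));
	close(fd);

	return n == (ssize_t)sizeof(*entry) ? 0 : -1;
}
```

A caller would touch the page first, call pagemap_read_entry(), check
PM_PRESENT, and only then test the flag of interest with
pagemap_bit_set(). The entry describes one base-page-sized slot of the
virtual address space, so a single bit per entry says nothing about the
size of the underlying mapping.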