Re: [PATCH] mm: replace is_zero_pfn with is_huge_zero_pmd for thp

From: Gerald Schaefer
Date: Mon Aug 26 2019 - 11:09:52 EST


On Mon, 26 Aug 2019 06:18:58 -0700
Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:

> Why did you not cc Gerald who wrote the patch? You can't just
> run get_maintainers.pl and call it good.
>
> On Sun, Aug 25, 2019 at 02:06:21PM -0600, Yu Zhao wrote:
> > For a pmd-mapped thp, we use is_huge_zero_pmd() to check whether it
> > is the zero page or not.
> >
> > We do fill ptes with my_zero_pfn() when we split a zero thp pmd, but
> > that is not the case we see in vm_normal_page_pmd();
> > pmd_trans_huge_lock() makes sure of it.
> >
> > This is a trivial fix for /proc/pid/numa_maps, and AFAIK nobody
> > complains about it.
> >
> > Signed-off-by: Yu Zhao <yuzhao@xxxxxxxxxx>
> > ---
> > mm/memory.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/mm/memory.c b/mm/memory.c
> > index e2bb51b6242e..ea3c74855b23 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -654,7 +654,7 @@ struct page *vm_normal_page_pmd(struct vm_area_struct *vma, unsigned long addr,
> >
> >  	if (pmd_devmap(pmd))
> >  		return NULL;
> > -	if (is_zero_pfn(pfn))
> > +	if (is_huge_zero_pmd(pmd))
> >  		return NULL;
> >  	if (unlikely(pfn > highest_memmap_pfn))
> >  		return NULL;
> > --
> > 2.23.0.187.g17f5b7556c-goog
> >

Looks good to me. The "_pmd" versions of can_gather_numa_stats() and
vm_normal_page() were introduced to avoid using pte_present()/pte_dirty()
on pmds, and this patch does not change that.

In fact, for vm_normal_page_pmd() I basically copied most of the code
from vm_normal_page(), including the is_zero_pfn(pfn) check, which does
look wrong to me now. Using is_huge_zero_pmd() should be correct.
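
To illustrate why the copied check can miss the huge zero page, here is a
minimal user-space sketch. It is not kernel code: the toy_* names, the
toy_pmd_t type and the pfn values are all made up for the example. The only
assumption carried over is that the huge zero page is a separate page from
the regular zero page, so comparing a pmd's pfn with the regular zero pfn
never matches it.

```c
#include <stdbool.h>

/* Made-up pfn values; in this toy model the two zero pages simply differ. */
static const unsigned long toy_zero_pfn = 0x100;      /* regular (pte-sized) zero page */
static const unsigned long toy_huge_zero_pfn = 0x200; /* huge zero page */

/* A toy pmd that only remembers which pfn it maps. */
typedef struct {
	unsigned long pfn;
} toy_pmd_t;

/* Models the check copied from the pte path. */
static bool toy_is_zero_pfn(unsigned long pfn)
{
	return pfn == toy_zero_pfn;
}

/* Models a check that compares against the huge zero page. */
static bool toy_is_huge_zero_pmd(toy_pmd_t pmd)
{
	return pmd.pfn == toy_huge_zero_pfn;
}

/* true means "treated as a normal page", i.e. the lookup would not return NULL */
static bool toy_normal_page_old(toy_pmd_t pmd)
{
	return !toy_is_zero_pfn(pmd.pfn);
}

static bool toy_normal_page_new(toy_pmd_t pmd)
{
	return !toy_is_huge_zero_pmd(pmd);
}
```

With a toy pmd that maps toy_huge_zero_pfn, toy_normal_page_old() reports a
normal page while toy_normal_page_new() does not, which is the difference
the patch is after.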

Maybe the description could also mention the symptom of this bug?
I would assume that, for huge mappings, it affects the anon/dirty
accounting in gather_pte_stats() when zero page mappings are not
correctly recognized.
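
As a rough picture of that kind of symptom, the following user-space toy
shows how an unrecognized zero mapping could end up in the statistics. It is
only a sketch under assumptions: struct toy_stats, toy_gather_pmd() and
TOY_PAGES_PER_PMD are hypothetical, the value 512 assumes 4K base pages with
2M huge pages, and which counters really change in the kernel is not
established here.

```c
#include <stdbool.h>

/* Assumption: 2M huge page / 4K base page = 512 base pages per pmd mapping. */
#define TOY_PAGES_PER_PMD 512UL

/* Hypothetical stand-in for the per-VMA statistics. */
struct toy_stats {
	unsigned long pages;
};

/*
 * page_found models a non-NULL result from the page lookup.
 * A mapping that is recognized as the zero page yields no page
 * and is skipped; anything else is accounted as a full huge mapping.
 */
static void toy_gather_pmd(struct toy_stats *md, bool page_found)
{
	if (!page_found)
		return;
	md->pages += TOY_PAGES_PER_PMD;
}
```

In this model, a huge zero mapping that the lookup fails to recognize is
passed in with page_found set and inflates the page count by a whole huge
page, whereas a correctly recognized one leaves the counters alone.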

Regards,
Gerald