Re: kernel BUG in page_try_dup_anon_rmap

From: Mike Kravetz
Date: Thu Oct 20 2022 - 17:59:35 EST


On 10/20/22 09:59, Mike Kravetz wrote:
> On 10/21/22 00:21, Wei Chen wrote:
> > Dear Vlastimil,
> >
> > Thank you for the reply. The bug persists in v6.0. Here is the
> > information. Luckily I got C reproducer this time.
>
> Ooh. Looks like the reproducer is doing a MADV_DONTNEED on a hugetlb mapping.
> That support was added somewhat recently (5.18). Not sure if it is related in
> any way. Have not looked at the code/implementation around write_protect_seq.

I verified that the new hugetlb MADV_DONTNEED is the root cause. :(

The reproducer calls madvise(MADV_DONTNEED) on the hugetlb mapping before
mapping any pages. madvise(MADV_DONTNEED) ends up calling:

	zap_page_range
	  unmap_single_vma
	    __unmap_hugepage_range_final
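
For anyone who wants to see the shape of the sequence without digging
through the syzkaller output, here is a minimal sketch. It is not the
actual reproducer: the helper name dontneed_before_fault, the MAP_SHARED
anonymous mapping and the caller-supplied length are assumptions made for
illustration. It deliberately stops after the madvise() call and never
faults a page or forks, so it only exercises the call path above.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* for MAP_HUGETLB and MAP_ANONYMOUS */
#endif
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper, not the syzkaller reproducer.
 * Returns 0 if both calls succeeded, 1 if mmap() failed (for example
 * because no huge pages are reserved), 2 if madvise() failed.
 */
int dontneed_before_fault(size_t len)
{
	/* Assumed: a shared anonymous hugetlb mapping; no page is touched. */
	void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		       MAP_SHARED | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
	if (p == MAP_FAILED)
		return 1;

	/* MADV_DONTNEED on the still-empty mapping. */
	int rc = madvise(p, len, MADV_DONTNEED);

	munmap(p, len);
	return rc == 0 ? 0 : 2;
}
```

len has to be a multiple of the huge page size (2 MiB is a common default
on x86-64, but that is an assumption about the test machine). On an
affected kernel this is best tried in a throwaway VM.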

__unmap_hugepage_range_final ends up clearing VM_MAYSHARE. This is
because it assumes the vma is going away and wants to prevent someone from
doing PMD sharing with the vma on its way out. This causes confusion in
subsequent faults in the vma, as the shared or private decision keys off
VM_MAYSHARE. We then end up with pages in the page table where page_mapping
is NULL.
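
To make the flag problem concrete, here is a toy userspace model. None of
this is kernel code: struct toy_vma, TOY_VM_MAYSHARE and the toy_*
functions are made-up stand-ins, and the bit value is chosen for
illustration only. It just shows that once the "final unmap" step clears
the flag on a vma that in fact stays around, later code that keys off the
flag sees the same mapping as private.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins, not kernel definitions. */
#define TOY_VM_MAYSHARE 0x80UL

struct toy_vma {
	unsigned long flags;
};

/* Stand-in for the step that assumes the vma is going away. */
void toy_unmap_final(struct toy_vma *vma)
{
	vma->flags &= ~TOY_VM_MAYSHARE;
}

/* Stand-in for fault-time logic that keys shared/private off the flag. */
bool toy_fault_treats_as_shared(const struct toy_vma *vma)
{
	return (vma->flags & TOY_VM_MAYSHARE) != 0;
}

/* Walk through the sequence: shared before, "private" afterwards. */
int toy_demo(void)
{
	struct toy_vma vma = { .flags = TOY_VM_MAYSHARE };

	assert(toy_fault_treats_as_shared(&vma));
	toy_unmap_final(&vma);		/* vma survives, flag does not */
	return toy_fault_treats_as_shared(&vma) ? 1 : 0;
}
```

In the model, toy_demo() returns 0: the vma was created shared, but a
fault after the zap would treat it as private.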

The somewhat good news is that I thought the clearing of VM_MAYSHARE as done
above was kludgy, and was able to remove it in 6.1 with the introduction of
the hugetlb vma_lock for pmd sharing. So, this should not be an issue in
development branches.

I'll come up with a fix for the 5.18 to 6.0 kernels.
--
Mike Kravetz