Re: [PATCH] mm/huge_memory: fix the memory leak due to the race

From: Kirill A. Shutemov
Date: Tue Jun 21 2016 - 12:32:05 EST


On Tue, Jun 21, 2016 at 11:19:07PM +0800, zhong jiang wrote:
> On 2016/6/21 22:37, Kirill A. Shutemov wrote:
> > On Tue, Jun 21, 2016 at 10:05:56PM +0800, zhongjiang wrote:
> >> From: zhong jiang <zhongjiang@xxxxxxxxxx>
> >>
> >> I ran some test cases under heavy pressure and found that a THP
> >> is not freed. The leak is detected by check_mm().
> >>
> >> BUG: Bad rss-counter state mm:ffff8827edb70000 idx:1 val:512
> >>
> >> Consider the following race:
> >>
> >> CPU0                                 CPU1
> >> __handle_mm_fault()
> >> wp_huge_pmd()
> >> do_huge_pmd_wp_page()
> >> pmdp_huge_clear_flush_notify()
> >> (pmd_none = true)
> >>                                      exit_mmap()
> >>                                      unmap_vmas()
> >>                                      zap_pmd_range()
> >>                                      pmd_none_or_trans_huge_or_clear_bad()
> >>                                      (result in memory leak)
> >> set_pmd_at()
> >>
> >> CPU0 has allocated the huge page before pmdp_huge_clear_flush_notify(),
> >> which makes the pmd entry null. Therefore, the memory leak can occur.
> >>
> >> The patch fixes the scenario in which the pmd entry can become null.
> > I don't think the scenario is possible.
> >
> > exit_mmap() is called when all mm users have gone, so no parallel
> > threads exist.
> >
> Forget this patch. It's my fault; the race indeed does not exist.
> But I hit the following problem: we can see the memory leak when the process exits.
>
>
> Any suggestions will be appreciated.

Could you try this:

http://lkml.kernel.org/r/20160621150433.GA7536@xxxxxxxxxxxxxxxxxx

--
Kirill A. Shutemov