Re: crash during oom reaper (was: Re: [PATCH 4/4] [RFC!] mm: 'struct mm_struct' reference counting debugging)
From: Kirill A. Shutemov
Date: Fri Dec 16 2016 - 05:46:25 EST
On Fri, Dec 16, 2016 at 11:11:13AM +0100, Michal Hocko wrote:
> On Fri 16-12-16 10:43:52, Vegard Nossum wrote:
> [...]
> > I don't think it's a bug in the OOM reaper itself, but either of the
> > following two patches will fix the problem (without my understanding how
> > or why):
> >
> > diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> > index ec9f11d4f094..37b14b2e2af4 100644
> > --- a/mm/oom_kill.c
> > +++ b/mm/oom_kill.c
> > @@ -485,7 +485,7 @@ static bool __oom_reap_task_mm(struct task_struct *tsk, struct mm_struct *mm)
> > */
> > mutex_lock(&oom_lock);
> >
> > - if (!down_read_trylock(&mm->mmap_sem)) {
> > + if (!down_write_trylock(&mm->mmap_sem)) {
>
> __oom_reap_task_mm is basically the same thing as MADV_DONTNEED and that
> doesn't require the exclusive mmap_sem. So this looks correct to me.

BTW, shouldn't we filter out all VM_SPECIAL VMAs there? Or VM_PFNMAP at
least.

MADV_DONTNEED doesn't touch VM_PFNMAP, but I don't see anything matching
on the __oom_reap_task_mm() side.

Another difference is that you use unmap_page_range(), which doesn't touch
mmu_notifiers. MADV_DONTNEED goes via zap_page_range(), which invalidates
the range. Not sure if it can make any difference here.
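
To make the VM_PFNMAP / VM_SPECIAL filtering question concrete, here is a
rough userspace model of the two candidate checks. It is not kernel code
and not a proposed patch: the flag values are written from memory of
include/linux/mm.h and should be treated as assumptions, and the helper
names (reaper_skips_pfnmap, reaper_skips_special, count_reapable) are
hypothetical.

```c
/* Userspace model only, not kernel code. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed flag values, recalled from include/linux/mm.h; verify against your tree. */
#define VM_PFNMAP     0x00000400UL
#define VM_IO         0x00004000UL
#define VM_DONTEXPAND 0x00040000UL
#define VM_MIXEDMAP   0x10000000UL
#define VM_SPECIAL    (VM_IO | VM_DONTEXPAND | VM_PFNMAP | VM_MIXEDMAP)

/* Hypothetical helper: the minimal filter (skip raw PFN mappings only). */
static bool reaper_skips_pfnmap(unsigned long vm_flags)
{
	return (vm_flags & VM_PFNMAP) != 0;
}

/* Hypothetical helper: the stricter filter (skip every VM_SPECIAL mapping). */
static bool reaper_skips_special(unsigned long vm_flags)
{
	return (vm_flags & VM_SPECIAL) != 0;
}

/* Count the VMAs a reaper-style loop would still unmap under a given filter. */
static size_t count_reapable(const unsigned long *vm_flags, size_t n,
			     bool (*skip)(unsigned long))
{
	size_t reapable = 0;

	assert(vm_flags != NULL && skip != NULL);
	for (size_t i = 0; i < n; i++)
		if (!skip(vm_flags[i]))
			reapable++;
	return reapable;
}
```

Calling count_reapable() on the same array of flags with each filter shows
how many more mappings the VM_SPECIAL variant would leave alone compared
with the VM_PFNMAP-only one.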
--
Kirill A. Shutemov