Re: Ugly rmap NULL ptr deref oopsie on hibernate (was Linux 2.6.34-rc3)

From: Minchan Kim
Date: Sun Apr 04 2010 - 12:13:31 EST

Hi, Rik.

On Fri, 2010-04-02 at 18:01 -0400, Rik van Riel wrote:
> On 04/02/2010 02:37 PM, Linus Torvalds wrote:
> > On Fri, 2 Apr 2010, Andrew Morton wrote:
> >> On Fri, 2 Apr 2010 11:09:14 -0700 (PDT) Linus Torvalds<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> >>
> >>>
> >>> I think this is likely due to the new scalable anon_vma linking by Rik.
> >>
> >> Similar to
> >
> > Yup, looks like the same thing, except that bugzilla entry was due to
> > swapping rather than hibernation and memory shrinking. But same end
> > result, just different reasons for why we were trying to shrink the page
> > lists.
> Interesting that it is a null pointer dereference, given
> that we do not zero out the anon_vma_chain structs before
> freeing them.
> Page_referenced_anon() takes the anon_vma->lock before
> walking the list. The three places where we modify the
> anon_vma_chain->same_anon_vma list, we also hold the
> lock.
> No doubt something in mm/ is doing something silly, but
> I have not found anything yet :(
> If I had to guess, I'd say maybe we got one of the
> mprotect & vma_adjust cases wrong. Maybe a page stayed
> around in the LRU (and in a process?) after its anon_vma
> already got freed?

While reviewing the code again because of this BUG, I found something
strange.

In anon_vma_fork(), what happens if anon_vma_clone() succeeds but
anon_vma_alloc() then fails? At that point the parent VMA's anon_vmas
already have anon_vma_chain entries linked to them, and those entries
point to the new VMA, which gets destroyed on the error path.
I couldn't find any cleanup routine that removes this garbage.
Am I missing something?
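
To make the scenario concrete, here is a small userspace model of it.
This is only a sketch, not kernel code: the structs borrow the kernel
names but are simplified stand-ins, and chain_link(), unlink_chains(),
count_chains() and model_failed_fork() are hypothetical helpers made up
for illustration. It assumes the error path simply returns after the
clone step has already linked a chain entry into the parent's anon_vma.

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified stand-ins for the kernel structures, not the real ones. */
struct vma;
struct anon_vma;

struct anon_vma_chain {
	struct vma *vma;                      /* VMA this entry belongs to */
	struct anon_vma *anon_vma;            /* anon_vma this entry is linked on */
	struct anon_vma_chain *next_same_av;  /* next entry on the same anon_vma */
	struct anon_vma_chain *next_same_vma; /* next entry of the same VMA */
};

struct anon_vma {
	struct anon_vma_chain *head;
};

struct vma {
	struct anon_vma_chain *chains;
};

/* Link vma to av with a new chain entry; returns -1 if allocation fails. */
static int chain_link(struct vma *vma, struct anon_vma *av)
{
	struct anon_vma_chain *avc = malloc(sizeof(*avc));

	if (!avc)
		return -1;
	avc->vma = vma;
	avc->anon_vma = av;
	avc->next_same_av = av->head;
	av->head = avc;
	avc->next_same_vma = vma->chains;
	vma->chains = avc;
	return 0;
}

/* Remove every chain entry of vma from its anon_vma and free it. */
static void unlink_chains(struct vma *vma)
{
	while (vma->chains) {
		struct anon_vma_chain *avc = vma->chains;
		struct anon_vma_chain **pp = &avc->anon_vma->head;

		while (*pp != avc) {
			assert(*pp != NULL); /* entry must be on the list */
			pp = &(*pp)->next_same_av;
		}
		*pp = avc->next_same_av;
		vma->chains = avc->next_same_vma;
		free(avc);
	}
}

/* Number of chain entries currently linked on av. */
static int count_chains(const struct anon_vma *av)
{
	const struct anon_vma_chain *c;
	int n = 0;

	for (c = av->head; c; c = c->next_same_av)
		n++;
	return n;
}

/*
 * Model of the failing fork: the "clone" step links the child VMA into
 * the parent's anon_vma, then the allocation of the child's own anon_vma
 * is assumed to fail. With cleanup == 0 the error path just returns.
 */
static void model_failed_fork(struct anon_vma *parent_av, int cleanup)
{
	struct vma *child = calloc(1, sizeof(*child));

	if (!child)
		return;
	if (chain_link(child, parent_av) != 0) {
		free(child);
		return;
	}
	/* Pretend the child's own anon_vma allocation failed here. */
	if (cleanup)
		unlink_chains(child);
	free(child); /* the caller destroys the child VMA on error */
}
```

In this model, after model_failed_fork(&av, 0) the parent's anon_vma
still has one entry whose vma pointer refers to freed memory, while
model_failed_fork(&av, 1) leaves the list empty.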

But I think it isn't related to this bug, because the oops point is not
vma_address but

Kind regards,
Minchan Kim

