Re: Ugly rmap NULL ptr deref oopsie on hibernate (was Linux 2.6.34-rc3)

From: Rik van Riel
Date: Fri Apr 02 2010 - 18:03:29 EST


On 04/02/2010 02:37 PM, Linus Torvalds wrote:
> On Fri, 2 Apr 2010, Andrew Morton wrote:
>> On Fri, 2 Apr 2010 11:09:14 -0700 (PDT) Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>>
>>> I think this is likely due to the new scalable anon_vma linking by Rik.
>>
>> Similar to https://bugzilla.kernel.org/show_bug.cgi?id=15680
>
> Yup, looks like the same thing, except that bugzilla entry was due to
> swapping rather than hibernation and memory shrinking. But same end
> result, just different reasons for why we were trying to shrink the page
> lists.

Interesting that it is a null pointer dereference, given
that we do not zero out the anon_vma_chain structs before
freeing them.

page_referenced_anon() takes the anon_vma->lock before
walking the list. In the three places where we modify the
anon_vma_chain->same_anon_vma list, we also hold the
lock.
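
To make the locking rule above concrete, here is a minimal userspace
sketch, not the kernel code. It assumes a pthread mutex standing in for
anon_vma->lock, and every type and function name in it (anon_vma_model,
chain_link, chain_unlink, count_referenced) is hypothetical and chosen
only for illustration. The point is just that the walker and every
list modifier take the same lock:

```c
#include <pthread.h>
#include <stddef.h>

/* Simplified userspace stand-ins; these are NOT the kernel definitions. */
struct list_node {
    struct list_node *prev, *next;
};

struct anon_vma_model {
    pthread_mutex_t lock;     /* stands in for anon_vma->lock */
    struct list_node head;    /* circular list of chains on this anon_vma */
};

struct anon_vma_chain_model {
    struct list_node same_anon_vma;  /* linkage protected by the lock above */
    int referenced;                  /* hypothetical per-entry flag */
};

void anon_vma_model_init(struct anon_vma_model *av)
{
    pthread_mutex_init(&av->lock, NULL);
    av->head.prev = av->head.next = &av->head;  /* empty circular list */
}

/* Modifier: append a chain to the tail while holding the lock. */
void chain_link(struct anon_vma_model *av, struct anon_vma_chain_model *avc)
{
    pthread_mutex_lock(&av->lock);
    avc->same_anon_vma.next = &av->head;
    avc->same_anon_vma.prev = av->head.prev;
    av->head.prev->next = &avc->same_anon_vma;
    av->head.prev = &avc->same_anon_vma;
    pthread_mutex_unlock(&av->lock);
}

/* Modifier: remove a chain from the list while holding the lock. */
void chain_unlink(struct anon_vma_model *av, struct anon_vma_chain_model *avc)
{
    pthread_mutex_lock(&av->lock);
    avc->same_anon_vma.prev->next = avc->same_anon_vma.next;
    avc->same_anon_vma.next->prev = avc->same_anon_vma.prev;
    pthread_mutex_unlock(&av->lock);
}

/* Walker: take the lock, then traverse, as the rmap walk is described to do. */
int count_referenced(struct anon_vma_model *av)
{
    int n = 0;
    struct list_node *p;

    pthread_mutex_lock(&av->lock);
    for (p = av->head.next; p != &av->head; p = p->next) {
        /* Recover the containing entry from its embedded list node. */
        struct anon_vma_chain_model *avc = (struct anon_vma_chain_model *)
            ((char *)p - offsetof(struct anon_vma_chain_model, same_anon_vma));
        n += avc->referenced;
    }
    pthread_mutex_unlock(&av->lock);
    return n;
}
```

If all modifiers and the walker really do follow this pattern, the list
itself should never be seen half-updated. That is why the oops points
more towards the structure being freed or reused while something still
refers to it than towards a plain list race.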

No doubt something in mm/ is doing something silly, but
I have not found anything yet :(

If I had to guess, I'd say maybe we got one of the
mprotect & vma_adjust cases wrong. Maybe a page stayed
around in the LRU (and in a process?) after its anon_vma
already got freed?

There has to be a reason why a very heavy AIM7 workload
and some other stress tests did not trigger it, but a few
people are able to trigger it on their systems...