Re: 2.6.16-rc1-mm3
From: Nick Piggin
Date: Thu Jan 26 2006 - 14:45:52 EST
Michal Piotrowski wrote:
Hi,
On 25/01/06, Nick Piggin <nickpiggin@xxxxxxxxxxxx> wrote:
Hi,
Michal Piotrowski wrote:
------------[ cut here ]------------
kernel BUG at /usr/src/linux-mm/include/linux/mm.h:302!
invalid opcode: 0000 [#1]
PREEMPT SMP DEBUG_PAGEALLOC
last sysfs file: /class/vc/vcsa7/dev
Modules linked in: binfmt_misc thermal fan processor ipv6 w83627hf
hwmon_vid hwmon i2c_isa snd_intel8x0 snd_ac97_codec snd_ac97_bus
sk98lin snd_pcm_oss snd_mixer_oss skge intel_agp snd_pcm snd_timer snd
soundcore i2c_i801 parport_pc parport snd_page_alloc 8250_pnp 8250
serial_core agpgart rtc ide_cd cdrom hw_random unix
CPU: 0
EIP: 0060:[<b013fe81>] Not tainted VLI
EFLAGS: 00210246 (2.6.16-rc1-mm3 #1)
EIP is at release_pages+0x33/0x15e
Is it repeatable?
If so, I'd imagine it must be a specific driver page which is not properly
refcounted somewhere. A bug in generic code would have shown up elsewhere
by now.
Can you try something like the attached patch and see what it gives you?
Thanks, it confirms my suspicions.
Can you try the following patch, please?
It appears the warnings were brought out by my improvement to
the put_page_testzero debugging code (which previously did not
check that we might be attempting to free a constituent compound
page).
Can you test the following patch please?
--
SUSE Labs, Novell Inc.
Index: linux-2.6/include/linux/mm.h
===================================================================
--- linux-2.6.orig/include/linux/mm.h
+++ linux-2.6/include/linux/mm.h
@@ -15,6 +15,7 @@
#include <linux/prio_tree.h>
#include <linux/fs.h>
#include <linux/mutex.h>
+#include <linux/kallsyms.h>
struct mempolicy;
struct anon_vma;
@@ -264,6 +265,8 @@ struct page {
void *virtual; /* Kernel virtual address (NULL if
not kmapped, ie. highmem) */
#endif /* WANT_PAGE_VIRTUAL */
+
+ void *debug;
};
#define page_private(page) ((page)->private)
@@ -294,8 +297,14 @@ struct page {
*/
static inline int put_page_testzero(struct page *page)
{
- BUG_ON(atomic_read(&page->_count) == 0);
- return atomic_dec_and_test(&page->_count);
+ if (unlikely(atomic_read(&page->_count) == 0)) {
+ printk(KERN_WARNING "put_page_testzero found free page (flags = %lx)\n", page->flags);
+ if (page->debug)
+ print_symbol(KERN_WARNING "nopage is %s\n", (unsigned long)page->debug);
+ WARN_ON(1);
+ return 0;
+ } else
+ return atomic_dec_and_test(&page->_count);
}
/*
Index: linux-2.6/mm/memory.c
===================================================================
--- linux-2.6.orig/mm/memory.c
+++ linux-2.6/mm/memory.c
@@ -2056,6 +2056,8 @@ retry:
if (new_page == NOPAGE_OOM)
return VM_FAULT_OOM;
+ new_page->debug = (struct address_space *)vma->vm_ops->nopage;
+
/*
* Should we do an early C-O-W break?
*/
Index: linux-2.6/mm/page_alloc.c
===================================================================
--- linux-2.6.orig/mm/page_alloc.c
+++ linux-2.6/mm/page_alloc.c
@@ -521,6 +521,8 @@ static int prep_new_page(struct page *pa
if (PageReserved(page))
return 1;
+ page->debug = NULL;
+
page->flags &= ~(1 << PG_uptodate | 1 << PG_error |
1 << PG_referenced | 1 << PG_arch_1 |
1 << PG_checked | 1 << PG_mappedtodisk);