Re: 2.6.9-rc2: "kernel BUG at mm/rmap.c:473!"

From: Hugh Dickins
Date: Sun Sep 19 2004 - 07:09:28 EST


On Sat, 18 Sep 2004, Mike Kirk wrote:
> Not sure what this means: but the system kept running and I only lost a
> bzip2 process: 2.6.9-rc2 w/preempt AMD 2700+ on A7N8X motherboard, 1GB
> DDR400:
> ==============================
> kernel BUG at mm/rmap.c:473!
> EIP is at page_remove_rmap+0x29/0x40

Was there a "Bad page state" message and backtrace shortly before this?
I say "shortly" because I don't suppose bzip2 had been running for hours,
I'd expect the underlying error to have occurred while it was running.

BUG_ON(page_mapcount(page) < 0);

Previous incidents of this BUG (or its antecedents: the mapcount has
recently changed to atomic from guarded by spinlock) have been after
something elsewhere has freed a page it no longer owned, which has
meanwhile got mapped into process address space.

If that's the case this time around, then I hope the bad_page backtrace
will shed light on what's freeing that shouldn't. But if there was no
"Bad page state" message before, then I'll have to start worrying about
rmap mapcount consistency.

(The page_remove_rmap BUG follows as a consequence of bad_page resetting
the mapcount on such a page. This is unsatisfactory: that code remote
from the cause should force a BUG, whereas code nearer the cause try to
continue without forcing a BUG. Just removing the BUG from mm/rmap.c
would let me off the hook, but seems irresponsible. Adding a BUG into
bad_page would revert an intentional relaxation. I don't know.)

Hugh

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/