Re: exit_mmap BUG_ON in 2.6.23

From: Hugh Dickins
Date: Sat May 19 2012 - 16:46:55 EST


On Fri, 18 May 2012, Sam Portolla wrote:
> [please cc samPortolla@xxxxxxxxx on your replies, not subscribed to the linux-kernel mailer]
>
> Hi, I have read the thread on same issue in 3.1:
> but this is happening on earlier GNU linux version 2.6.23 for x86_64,
> which does not have THP (I believe), nor it has huge_memory.c.
> Is there a fix one of you experts could supply?  Issue is not reproducible
> so far, but happened on a customer site. Some info below.
>
> kernel BUG at .../bfc/linux/kernel-2.6.x/mm/mmap.c:2049!
>
> Line 2049 is in exit_mmap():
>
> BUG_ON(mm->nr_ptes > (FIRST_USER_ADDRESS+PMD_SIZE-1)>>PMD_SHIFT);
>
>  RIP: 0010:[<ffffffff80277840>] [<ffffffff80277840>] exit_mmap+0xf0/0x100
> [snip]
>  Call Trace:
> [<ffffffff8022ee14>] mmput+0x44/0xd0
> [<ffffffff802340a1>] exit_mm+0x91/0x100
> [<ffffffff802347ea>] do_exit+0x17a/0x960
> [<ffffffff8023c4bc>] __dequeue_signal+0xec/0x1b0
> [<ffffffff80235048>] do_group_exit+0x38/0x90
> [<ffffffff8023e3c6>] get_signal_to_deliver+0x2d6/0x4b0
> [<ffffffff8020b69a>] do_notify_resume+0xaa/0x760
> [<ffffffff8020c818>] retint_signal+0x3d/0x85

I've checked back through old ChangeLogs and, apart from a User Mode Linux
case, I don't see any fix for a BUG_ON(nr_ptes) issue between 2.6.19
and the much later THP issue, which you're right to think cannot be yours.

But the 2.6.19 case, and one which a video driver writer hit more recently,
were both caused by unrelated code zeroing beyond what it had allocated:
it happened to zero part of a higher-level page table, which made it
impossible for task exit to locate all the page tables (and pages) it
had to free.

Though I can't be sure, these BUG_ON(nr_ptes) reports seem too infrequent
to be caused by bad logic in mm itself: I suspect memory corruption in
your case too.

There's no clue here as to what the cause might be, I'm afraid.
Rebuilding your kernel with CONFIG_DEBUG_PAGEALLOC=y, and slab debugging
on, might shed more light: but that's probably not something you want to
get into on a customer site, for a problem only seen once or twice.
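
As a rough sketch of what that means in .config terms: the option names
below are the usual mainline ones, but which of them your 2.6.23-based
tree offers on x86_64, and which slab allocator it is built with, are
assumptions to check against its own Kconfig files.

```
# Unmap freed pages so that stray accesses fault immediately
CONFIG_DEBUG_KERNEL=y
CONFIG_DEBUG_PAGEALLOC=y
# Slab debugging, if the kernel is built with CONFIG_SLAB
CONFIG_DEBUG_SLAB=y
# Slab debugging, if the kernel is built with CONFIG_SLUB
# (then also boot with slub_debug on the kernel command line)
CONFIG_SLUB_DEBUG=y
```

Both kinds of debugging cost memory and speed, which is another reason
to try them on a test machine rather than at the customer site.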

The best I can suggest is for you to change that BUG_ON to a WARN_ON,
so at least the kernel doesn't crash there, and you might gather more
information from each time it happens; but you'll probably leak pages,
and may very well crash soon for other reasons (e.g. when evicting an
inode cannot locate all the maps of its pages).
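
That change is a one-liner in exit_mmap() in mm/mmap.c.  It is shown
here as a bare diff fragment against the line quoted in the report,
without surrounding context, since that depends on your tree.  On
x86_64, where FIRST_USER_ADDRESS is 0, the right-hand side works out
to 0, so the check fires whenever any page table is still accounted for
after teardown.

```diff
-	BUG_ON(mm->nr_ptes > (FIRST_USER_ADDRESS+PMD_SIZE-1)>>PMD_SHIFT);
+	WARN_ON(mm->nr_ptes > (FIRST_USER_ADDRESS+PMD_SIZE-1)>>PMD_SHIFT);
```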

Hugh