Re: [BUG] Bad page flags when process using mlock()ed memory exits
From: Johannes Weiner
Date: Fri Jun 19 2009 - 13:45:02 EST
On Fri, Jun 19, 2009 at 02:11:21PM +1000, Peter Chubb wrote:
>
> In recent kernels I've been seeing many mesages of the form:
>
> BUG: Bad page state in process reiserfsck pfn:79c58
> page:c3d03b00 flags:8050000c count:0 mapcount:0 mapping:(null) index:8095
> Pid: 3927, comm: reiserfsck Not tainted 2.6.30-test-05456-gda456f1 #60
> Call Trace:
> [<c134a67c>] ? printk+0xf/0x13
> [<c10774dc>] bad_page+0xc9/0xe2
> [<c1078041>] free_hot_cold_page+0x5c/0x204
> [<c1078206>] __pagevec_free+0x1d/0x25
> [<c107ac3e>] release_pages+0x14e/0x18e)
> [<c108ef8a>] free_pages_and_swap_cache+0x69/0x82
> [<c1089458>] exit_mmap+0xf6/0x11f
> [<c102afcd>] mmput+0x39/0xaf
> [<c102e534>] exit_mm+0xe5/0xed
> [<c102fa66>] do_exit+0x13f/0x578
> [<c102fefd>] do_group_exit+0x5e/0x85
> [<c102ff37>] sys_exit_group+0x13/0x17
> [<c10031ef>] sysenter_do_call+0x12/0x3c
> Disabling lock debugging due to kernel taint
>
> This appears to have been introduced by patch
> da456f14d2f2d7350f2b9440af79c85a34c7eed5
> page allocator: do not disable interrupts in free_page_mlock()
>
> That patch removed the free_page_mlock() from free_pages_check(), so
> if free_hot_cold_page() is called on an Mlocked page (e.g., if a
> process that used mlock() calls exit()) free_pages_check() will always
> barf, whereas before it would just unlock the page.
I prepared a fix, thanks for chasing it down.
Mel, to keep this simple I just used the atomic test-clear, but if I
am not mistaken we should not need any atomicity here, so we could
probably add a __TestClearPage version and use this instead...?
---
>From 493b74e8615db4e3323b5b169b0b8265dfd7c3f4 Mon Sep 17 00:00:00 2001
From: Johannes Weiner <hannes@xxxxxxxxxxx>
Date: Fri, 19 Jun 2009 19:30:56 +0200
Subject: [patch] mm: page_alloc: clear PG_locked before checking flags on free
da456f1 "page allocator: do not disable interrupts in free_page_mlock()" moved
the PG_mlocked clearing after the flag sanity checking which makes mlocked
pages always trigger 'bad page'. Fix this by clearing the bit up front.
Reported-by: Peter Chubb <peter.chubb@xxxxxxxxxxxx>
Debugged-by: Peter Chubb <peter.chubb@xxxxxxxxxxxx>
Signed-off-by: Johannes Weiner <hannes@xxxxxxxxxxx>
Cc: Mel Gorman <mel@xxxxxxxxx>
---
mm/page_alloc.c | 9 ++++-----
1 files changed, 4 insertions(+), 5 deletions(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 6f0753f..30d5093 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -488,7 +488,6 @@ static inline void __free_one_page(struct page *page,
*/
static inline void free_page_mlock(struct page *page)
{
- __ClearPageMlocked(page);
__dec_zone_page_state(page, NR_MLOCK);
__count_vm_event(UNEVICTABLE_MLOCKFREED);
}
@@ -558,7 +557,7 @@ static void __free_pages_ok(struct page *page, unsigned int order)
unsigned long flags;
int i;
int bad = 0;
- int clearMlocked = PageMlocked(page);
+ int wasMlocked = TestClearPageMlocked(page);
kmemcheck_free_shadow(page, order);
@@ -576,7 +575,7 @@ static void __free_pages_ok(struct page *page, unsigned int order)
kernel_map_pages(page, 1 << order, 0);
local_irq_save(flags);
- if (unlikely(clearMlocked))
+ if (unlikely(wasMlocked))
free_page_mlock(page);
__count_vm_events(PGFREE, 1 << order);
free_one_page(page_zone(page), page, order,
@@ -1022,7 +1021,7 @@ static void free_hot_cold_page(struct page *page, int cold)
struct zone *zone = page_zone(page);
struct per_cpu_pages *pcp;
unsigned long flags;
- int clearMlocked = PageMlocked(page);
+ int wasMlocked = TestClearPageMlocked(page);
kmemcheck_free_shadow(page, 0);
@@ -1041,7 +1040,7 @@ static void free_hot_cold_page(struct page *page, int cold)
pcp = &zone_pcp(zone, get_cpu())->pcp;
set_page_private(page, get_pageblock_migratetype(page));
local_irq_save(flags);
- if (unlikely(clearMlocked))
+ if (unlikely(wasMlocked))
free_page_mlock(page);
__count_vm_event(PGFREE);
--
1.6.3
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/