Re: [PATCH] mm: check the return value of lookup_page_ext for all call sites

From: Shi, Yang
Date: Wed Jun 01 2016 - 16:40:57 EST


On 5/29/2016 11:11 PM, Minchan Kim wrote:
On Fri, May 27, 2016 at 11:16:41AM -0700, Shi, Yang wrote:

<snip>


If we go this way, how do we guard against this race?

Thanks for pointing this out. It sounds reasonable. However, this
should only be possible on 32 bit, since only the 32 bit version of
page_is_idle() calls lookup_page_ext(); the 64 bit version does not.

And such a race condition should exist regardless of whether DEBUG_VM
is enabled, right?

RCU might be good enough to protect it.

A quick fix may look like:

diff --git a/include/linux/page_idle.h b/include/linux/page_idle.h
index 8f5d4ad..bf0cd6a 100644
--- a/include/linux/page_idle.h
+++ b/include/linux/page_idle.h
@@ -77,8 +77,12 @@ static inline bool test_and_clear_page_young(struct page *page)
static inline bool page_is_idle(struct page *page)
{
struct page_ext *page_ext;
+
+ rcu_read_lock();
page_ext = lookup_page_ext(page);
+ rcu_read_unlock();
+
if (unlikely(!page_ext))
return false;

diff --git a/mm/page_ext.c b/mm/page_ext.c
index 56b160f..94927c9 100644
--- a/mm/page_ext.c
+++ b/mm/page_ext.c
@@ -183,7 +183,6 @@ struct page_ext *lookup_page_ext(struct page *page)
{
unsigned long pfn = page_to_pfn(page);
struct mem_section *section = __pfn_to_section(pfn);
-#if defined(CONFIG_DEBUG_VM) || defined(CONFIG_PAGE_POISONING)
/*
* The sanity checks the page allocator does upon freeing a
* page can reach here before the page_ext arrays are
@@ -195,7 +194,7 @@ struct page_ext *lookup_page_ext(struct page *page)
*/
if (!section->page_ext)
return NULL;
-#endif
+
return section->page_ext + pfn;
}

@@ -279,7 +278,8 @@ static void __free_page_ext(unsigned long pfn)
return;
base = ms->page_ext + pfn;
free_page_ext(base);
- ms->page_ext = NULL;
+ rcu_assign_pointer(ms->page_ext, NULL);
+ synchronize_rcu();

How does it fix the problem?
I cannot understand your point.

Assigning the NULL pointer to page_ext is followed by synchronize_rcu(), which blocks until all pre-existing rcu_read_lock() critical sections have completed, so the lookup and the teardown are serialized. And rcu_read_lock() disables preemption too.

Yang


}

static int __meminit online_page_ext(unsigned long start_pfn,

Thanks,
Yang


CPU 0                                  CPU 1
kpageflags_read
  stable_page_flags
    page_is_idle
      lookup_page_ext
        section = __pfn_to_section(pfn)
                                       offline_pages
                                         memory_notify(MEM_OFFLINE)
                                           offline_page_ext
                                             ms->page_ext = NULL
        section->page_ext + pfn


Thanks.