Re: [PATCH v3 3/3] proc: add kpageidle file

From: Vladimir Davydov
Date: Fri May 08 2015 - 05:56:27 EST

On Mon, May 04, 2015 at 07:54:59PM +0900, Minchan Kim wrote:
> So, I guess once below compiler optimization happens in __page_set_anon_rmap,
> it could be corrupt in page_referenced.
> __page_set_anon_rmap:
> page->mapping = (struct address_space *) anon_vma;
> page->mapping = (struct address_space *)((void *)page_mapping + PAGE_MAPPING_ANON);
> Because page_referenced checks it with PageAnon which has no memory barrier.
> So if above compiler optimization happens, page_referenced can pass the anon
> page in rmap_walk_file, not rmap_walk_anon. It's my theory. :)


If such splits were possible, we would have bugs all over the kernel
IMO. An example is do_wp_page() vs shrink_active_list(). In do_wp_page()
we can call page_move_anon_rmap(), which sets page->mapping in exactly
the same fashion as above-mentioned __page_set_anon_rmap():

	anon_vma = (void *) anon_vma + PAGE_MAPPING_ANON;
	page->mapping = (struct address_space *) anon_vma;
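To make the pattern concrete, here is a minimal userspace sketch of it. The struct definitions and the helpers page_is_anon(), set_anon_mapping() and page_anon_vma_of() are hypothetical stand-ins, not the kernel's own code. It only shows the source-level idiom (build the tagged value in a local, then publish it with one assignment); it says nothing about what a compiler may or may not do with that assignment.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel types, for illustration only. */
struct address_space;
struct anon_vma { int dummy; };
struct page { struct address_space *mapping; };

#define PAGE_MAPPING_ANON 1UL

/* Same idea as PageAnon(): test the low bit of page->mapping. */
static int page_is_anon(const struct page *page)
{
	return ((uintptr_t)page->mapping & PAGE_MAPPING_ANON) != 0;
}

/* Build the tagged value in a local, then store it with one assignment. */
static void set_anon_mapping(struct page *page, struct anon_vma *anon_vma)
{
	uintptr_t tagged = (uintptr_t)anon_vma;

	/* The tag lives in bit 0, so the pointer must leave that bit clear. */
	assert((tagged & PAGE_MAPPING_ANON) == 0);
	tagged += PAGE_MAPPING_ANON;
	page->mapping = (struct address_space *)tagged;
}

/* Recover the anon_vma pointer by masking the tag bit off. */
static struct anon_vma *page_anon_vma_of(const struct page *page)
{
	return (struct anon_vma *)((uintptr_t)page->mapping & ~PAGE_MAPPING_ANON);
}
```

A reader that goes through page_is_anon() sees either the old mapping or the fully tagged one, provided the final assignment is carried out as a single store, which is exactly the assumption under discussion here.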

The page in question may be on an LRU list, because do_wp_page() neither
removes it from the list nor takes any LRU-related locks. The page is
locked, that's true, but shrink_active_list() calls page_referenced() on
an unlocked page, so according to your logic the two can race, with
page_referenced() seeing page->mapping equal to anon_vma w/o the
PAGE_MAPPING_ANON bit set:

----                                    ----
do_wp_page                              shrink_active_list
 lock_page                               page_referenced
                                          PageAnon->yes, so skip trylock_page
 page->mapping = anon_vma
 page->mapping = page->mapping+PAGE_MAPPING_ANON

However, this does not happen.

