On Sun, 24 Dec 2006, Linus Torvalds wrote:
Peter, tell me I'm crazy, but with the new rules, the following condition is a bug:
- shared mapping
- writable
- not already marked dirty in the PTE
Ok, so how about this diff.
I'm actually feeling good about this one. It really looks like "do_no_page()" was simply buggy, and that this explains everything.
Please please please test. Throw all the other patches away (with the possible exception of the "update_mmu_cache()" sanity checker, which is still interesting in case some _other_ place does this too).
Don't do the "wait_on_page_writeback()" thing, because it changes timings and might hide thngs for the wrong reasons. Just apply this on top of a known failing kernel, and test.
Linus
---
diff --git a/mm/memory.c b/mm/memory.c
index 563792f..cf429c4 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2247,21 +2249,23 @@ retry:
if (pte_none(*page_table)) {
flush_icache_page(vma, new_page);
entry = mk_pte(new_page, vma->vm_page_prot);
- if (write_access)
- entry = maybe_mkwrite(pte_mkdirty(entry), vma);
- set_pte_at(mm, address, page_table, entry);
if (anon) {
inc_mm_counter(mm, anon_rss);
lru_cache_add_active(new_page);
page_add_new_anon_rmap(new_page, vma, address);
+ if (write_access)
+ entry = maybe_mkwrite(pte_mkdirty(entry), vma);
} else {
inc_mm_counter(mm, file_rss);
page_add_file_rmap(new_page);
+ entry = pte_wrprotect(entry);
if (write_access) {
dirty_page = new_page;
get_page(dirty_page);
+ entry = maybe_mkwrite(pte_mkdirty(entry), vma);
}
}
+ set_pte_at(mm, address, page_table, entry);
} else {
/* One of our sibling threads was faster, back out. */
page_cache_release(new_page);