Re: [PATCH] mm: fix page_mkclean_one (was: 2.6.19 file contentcorruption on ext3)

From: Linus Torvalds
Date: Sun Dec 24 2006 - 13:37:51 EST

On Sun, 24 Dec 2006, Linus Torvalds wrote:
> How about this particularly stupid diff? (please test with something that
> _would_ cause corruption normally).

Actually, here's an even more stupid diff, which actually to some degree
seems to capture the real problem better.

Peter, tell me I'm crazy, but with the new rules, the following condition
is a bug:

- shared mapping
- writable
- not already marked dirty in the PTE

because that combination means that the hardware can mark the PTE dirty
without us even realizing (and thus not marking the "struct page *"

(The above is actually a valid situation for IO mappings, but not for
"real" mappings. And IO mappings should never take page faults, I think).

So, with that in mind, I wrote this stupid patch (for 32-bit x86, since I
used my Mac Mini for testing ratehr than my main machine - but the x86-64
version should be pretty much identcal)..

And you know what, Peter? It triggers for me. I get

WARNING at mm/memory.c:2274 do_no_page()
[<c0103d4a>] show_trace_log_lvl+0x1a/0x2f
[<c010436c>] show_trace+0x12/0x14
[<c01043f0>] dump_stack+0x16/0x18
[<c0159790>] __handle_mm_fault+0x38d/0x919
[<c011c8c4>] do_page_fault+0x1ff/0x507
[<c03fabcc>] error_code+0x7c/0x84

which seems to say that do_no_page() can be used to insert shared and
non-dirty, but still writable, pages.

But maybe my patch is just bogus, and I didn't think it through.

Peter, I realize it's Christmas Eve, but let's face it, Santa appreciates
good boys and girls, and we all want tons of loot. So please be good, and
waste some time looking at this and tell me why I'm either wrong, or
there's a real smoking gun here.. ;)


diff --git a/include/asm-i386/pgtable.h b/include/asm-i386/pgtable.h
index e6a4723..1389bb7 100644
--- a/include/asm-i386/pgtable.h
+++ b/include/asm-i386/pgtable.h
@@ -494,7 +494,13 @@ do { \
* The i386 doesn't have any external MMU info: the kernel page
* tables contain all the necessary information.
-#define update_mmu_cache(vma,address,pte) do { } while (0)
+#define bad_shared_pte(pte) (pte_write(pte) && !pte_dirty(pte))
+#define update_mmu_cache(vma,address,pte) do { \
+ static int __cnt; \
+ WARN_ON(((vma)->vm_flags & VM_SHARED) \
+ && bad_shared_pte(pte) \
+ && ++__cnt < 5); \
+} while (0)
#endif /* !__ASSEMBLY__ */

To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at
Please read the FAQ at