[PATCH] mm: hugepage: mark split pages dirty when needed

From: Peter Xu
Date: Tue Sep 04 2018 - 03:55:24 EST


When splitting a huge page, mark every resulting small page dirty if
the original huge page had the dirty bit set. Otherwise the dirty
state of the original mapping is lost.

CC: Andrea Arcangeli <aarcange@xxxxxxxxxx>
CC: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
CC: "Kirill A. Shutemov" <kirill.shutemov@xxxxxxxxxxxxxxx>
CC: Michal Hocko <mhocko@xxxxxxxx>
CC: Zi Yan <zi.yan@xxxxxxxxxxxxxx>
CC: Huang Ying <ying.huang@xxxxxxxxx>
CC: Dan Williams <dan.j.williams@xxxxxxxxx>
CC: Naoya Horiguchi <n-horiguchi@xxxxxxxxxxxxx>
CC: "Jérôme Glisse" <jglisse@xxxxxxxxxx>
CC: "Aneesh Kumar K.V" <aneesh.kumar@xxxxxxxxxxxxxxxxxx>
CC: Konstantin Khlebnikov <khlebnikov@xxxxxxxxxxxxxx>
CC: Souptick Joarder <jrdr.linux@xxxxxxxxx>
CC: linux-mm@xxxxxxxxx
CC: linux-kernel@xxxxxxxxxxxxxxx
Signed-off-by: Peter Xu <peterx@xxxxxxxxxx>
---

To the reviewers: I'm new to the mm world, so apologies if this patch
makes silly mistakes. It did, however, solve a problem for me when
testing with a customized Linux tree mostly based on Andrea's userfault
write-protect work. Without the change, my customized QEMU/tcg tree
cannot perform UFFDIO_WRITEPROTECT correctly, and QEMU gets a SIGBUS
after faulting multiple times. With the change (or, of course, with THP
disabled), UFFDIO_WRITEPROTECT resolves the write protections correctly
and QEMU runs well. Any comments are welcome. TIA.
---
mm/huge_memory.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index c3bc7e9c9a2a..0754a16923d5 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2176,6 +2176,8 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
 				entry = pte_mkold(entry);
 			if (soft_dirty)
 				entry = pte_mksoft_dirty(entry);
+			if (dirty)
+				entry = pte_mkdirty(entry);
 		}
 		pte = pte_offset_map(&_pmd, addr);
 		BUG_ON(!pte_none(*pte));
--
2.17.1