[PATCH v5 09/17] mm/madvise: Handle COW-ed PTE with madvise()

From: Chih-En Lin
Date: Fri Apr 14 2023 - 10:26:47 EST


Break COW PTE if madvise() modifies an entry of a COW-ed PTE.
The following flags need to break COW PTE. Flags such as
MADV_HUGEPAGE and MADV_MERGEABLE, however, should be handled separately.

- MADV_DONTNEED: It calls zap_page_range(), which is already handled.
- MADV_FREE: It uses walk_page_range() with madvise_free_pte_range() to
free the page by itself, so add break_cow_pte().
- MADV_REMOVE: Same as MADV_FREE, it removes the page by itself, so add
break_cow_pte_range().
- MADV_COLD: Similar to MADV_FREE, break COW PTE before pageout.
- MADV_POPULATE: Let GUP deal with it.

Signed-off-by: Chih-En Lin <shiyn.lin@xxxxxxxxx>
---
mm/madvise.c | 13 +++++++++++++
1 file changed, 13 insertions(+)
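
The user-visible invariant behind the flag list in the changelog can be
sketched from userspace: after fork(), a madvise() call in the child must
not change what the parent sees in its private mapping, which is why a
shared (COW-ed) PTE has to be broken before madvise() touches its
entries. The following is only an illustrative sketch and not part of
this series; the helper name check_madvise_isolation() is hypothetical,
and it assumes a Linux system with anonymous private mappings.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the patch): fill a private anonymous
 * mapping, fork, let the child call madvise(advice) on it, then verify
 * in the parent that the contents are unchanged.
 *
 * Returns 0 if the parent's data is intact, 1 if it changed,
 * and -1 if the setup (mmap/fork/waitpid) failed.
 */
static int check_madvise_isolation(int advice)
{
	size_t len = 4 * (size_t)sysconf(_SC_PAGESIZE);
	unsigned char *buf;
	int status, intact = 1;
	size_t i;
	pid_t pid;

	buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED)
		return -1;

	/* Fault every page in so both processes have populated PTEs. */
	memset(buf, 0xAB, len);

	pid = fork();
	if (pid < 0) {
		munmap(buf, len);
		return -1;
	}
	if (pid == 0) {
		/*
		 * Child: the advice may be unsupported on this kernel;
		 * either way it must not affect the parent, so the
		 * return value is deliberately ignored.
		 */
		(void)madvise(buf, len, advice);
		_exit(0);
	}

	if (waitpid(pid, &status, 0) < 0) {
		munmap(buf, len);
		return -1;
	}

	/* Parent: every byte written before fork() must still be there. */
	for (i = 0; i < len; i++) {
		if (buf[i] != 0xAB) {
			intact = 0;
			break;
		}
	}

	munmap(buf, len);
	return intact ? 0 : 1;
}
```

Calling check_madvise_isolation(MADV_DONTNEED) or
check_madvise_isolation(MADV_FREE) from a small main() is expected to
return 0 on a correct kernel, with or without COW PTE.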

diff --git a/mm/madvise.c b/mm/madvise.c
index 340125d08c03..71176edb751e 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -425,6 +425,9 @@ static int madvise_cold_or_pageout_pte_range(pmd_t *pmd,
if (pmd_trans_unstable(pmd))
return 0;
#endif
+ if (break_cow_pte(vma, pmd, addr))
+ return 0;
+
tlb_change_page_size(tlb, PAGE_SIZE);
orig_pte = pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl);
flush_tlb_batched_pending(mm);
@@ -625,6 +628,10 @@ static int madvise_free_pte_range(pmd_t *pmd, unsigned long addr,
if (pmd_trans_unstable(pmd))
return 0;

+ /* We should only allocate PTE. */
+ if (break_cow_pte(vma, pmd, addr))
+ goto next;
+
tlb_change_page_size(tlb, PAGE_SIZE);
orig_pte = pte = pte_offset_map_lock(mm, pmd, addr, &ptl);
flush_tlb_batched_pending(mm);
@@ -984,6 +991,12 @@ static long madvise_remove(struct vm_area_struct *vma,
if ((vma->vm_flags & (VM_SHARED|VM_WRITE)) != (VM_SHARED|VM_WRITE))
return -EACCES;

+ error = break_cow_pte_range(vma, start, end);
+ if (error < 0)
+ return error;
+ else if (error > 0)
+ return -ENOMEM;
+
offset = (loff_t)(start - vma->vm_start)
+ ((loff_t)vma->vm_pgoff << PAGE_SHIFT);

--
2.34.1