[PATCH v2 2/7] mm/mprotect: minor can_change_pte_writable() cleanups

From: David Hildenbrand
Date: Tue Nov 08 2022 - 12:48:24 EST


We want to replicate this code for handling PMDs soon.

(1) No need to crash the kernel, warning and rejecting is good enough. As
this will no longer get optimized out, drop the pte_write() check: no
harm would be done.

(2) Add a comment why PROT_NONE mapped pages are excluded.

(3) Add a comment regarding MAP_SHARED handling and why we rely on the
dirty bit in the PTE.

Signed-off-by: David Hildenbrand <david@xxxxxxxxxx>
---
mm/mprotect.c | 19 ++++++++++++++-----
1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/mm/mprotect.c b/mm/mprotect.c
index 86a28c0e190f..72aabffb7871 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -44,8 +44,10 @@ static inline bool can_change_pte_writable(struct vm_area_struct *vma,
{
struct page *page;

- VM_BUG_ON(!(vma->vm_flags & VM_WRITE) || pte_write(pte));
+ if (WARN_ON_ONCE(!(vma->vm_flags & VM_WRITE)))
+ return false;

+ /* Don't touch entries that are not even readable. */
if (pte_protnone(pte))
return false;

@@ -59,15 +61,22 @@ static inline bool can_change_pte_writable(struct vm_area_struct *vma,

if (!(vma->vm_flags & VM_SHARED)) {
/*
- * We can only special-case on exclusive anonymous pages,
- * because we know that our write-fault handler similarly would
- * map them writable without any additional checks while holding
- * the PT lock.
+ * Writable MAP_PRIVATE mapping: We can only special-case on
+ * exclusive anonymous pages, because we know that our
+ * write-fault handler similarly would map them writable without
+ * any additional checks while holding the PT lock.
*/
page = vm_normal_page(vma, addr, pte);
return page && PageAnon(page) && PageAnonExclusive(page);
}

+ /*
+ * Writable MAP_SHARED mapping: "clean" might indicate that the FS still
+ * needs a real write-fault for writenotify
+ * (see vma_wants_writenotify()). If "dirty", the assumption is that the
+ * FS was already notified and we can simply mark the PTE writable
+ * just like the write-fault handler would do.
+ */
return pte_dirty(pte);
}

--
2.38.1