[PATCH] docs/mm: add more warnings around page table access

From: Jann Horn
Date: Thu Nov 14 2024 - 16:13:02 EST


Make it clearer that holding the mmap lock in read mode is not enough
to traverse page tables, and that just having a stable VMA is not enough
to read PTEs.

Suggested-by: Matteo Rizzo <matteorizzo@xxxxxxxxxx>
Signed-off-by: Jann Horn <jannh@xxxxxxxxxx>
---
@akpm: Please don't put this in your tree before Lorenzo has replied.

@Lorenzo:
This is intended to go on top of your documentation patch.
If you think this is a sensible change, do you prefer to squash it into
your patch or do you prefer having akpm take this as a separate patch?
IDK what works better...
---
 Documentation/mm/process_addrs.rst | 21 +++++++++++++++++++--
 1 file changed, 19 insertions(+), 2 deletions(-)

diff --git a/Documentation/mm/process_addrs.rst b/Documentation/mm/process_addrs.rst
index 1bf7ad010fc063d003bb857bb3b695a3eafa0b55..9bdf073d0c3ebea1707812508a309aa4a6163660 100644
--- a/Documentation/mm/process_addrs.rst
+++ b/Documentation/mm/process_addrs.rst
@@ -339,6 +339,16 @@ When **installing** page table entries, the mmap or VMA lock must be held to
 keep the VMA stable. We explore why this is in the page table locking details
 section below.

+.. warning:: Taking the mmap lock in read mode **is not sufficient** for
+   traversing page tables; you must also ensure that a VMA exists that
+   covers the range being accessed.
+   Otherwise you can race with concurrent page table removal, which can
+   happen under the mmap lock in read mode in regions whose VMAs are no
+   longer present in the VMA tree.
+
+   (Alternatively, the mmap lock can be taken in write mode, but that
+   is heavy-handed and almost never the right choice.)
+
 **Freeing** page tables is an entirely internal memory management operation and
 has special requirements (see the page freeing section below for more details).

@@ -450,6 +460,9 @@ the time of writing of this document.
 Locking Implementation Details
 ------------------------------

+.. warning:: Locking rules for PTE-level page tables are very different from
+   locking rules for page tables at other levels.
+
 Page table locking details
 --------------------------

@@ -470,8 +483,12 @@ additional locks dedicated to page tables:
 These locks represent the minimum required to interact with each page table
 level, but there are further requirements.

-Importantly, note that on a **traversal** of page tables, no such locks are
-taken. Whether care is taken on reading the page table entries depends on the
+Importantly, note that on a **traversal** of page tables, sometimes no such
+locks are taken. However, at the PTE level, at least concurrent page table
+deletion must be prevented (using RCU), and a page table that resides in
+high memory must first be mapped; see below.
+
+Whether care is taken on reading the page table entries depends on the
 architecture, see the section on atomicity below.

 Locking rules

---
base-commit: 1e96a63d3022403e06cdda0213c7849b05973cd5
change-id: 20241114-vma-docs-addition1-onv3-32df4e6dffcf

--
Jann Horn <jannh@xxxxxxxxxx>