[PATCH] Don't touch single threaded PTEs which are on the right node

From: Andi Kleen
Date: Thu Oct 13 2016 - 14:13:50 EST

From: Andi Kleen <ak@xxxxxxxxxxxxxxx>

We had some problems with pages getting unmapped in single-threaded,
affinitized processes. This was tracked down to NUMA scanning.

It doesn't make any sense to unmap pages if the
process is single-threaded and the page is already on the
node the process is running on.

Add a check for this case to the NUMA protection code,
and skip unmapping if it is true.
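
The decision described above can be modelled outside the kernel as a
minimal sketch. The struct and function names below (scan_ctx,
pick_target_node, skip_page) are hypothetical stand-ins and not kernel
APIs; only the conditions themselves come from the patch.

```c
#include <stdbool.h>

#define NUMA_NO_NODE (-1)

/* Hypothetical stand-in for the kernel state the check reads. */
struct scan_ctx {
	bool prot_numa;    /* true when called for NUMA hinting protection */
	bool vm_shared;    /* models vma->vm_flags & VM_SHARED */
	int mm_users;      /* models atomic_read(&vma->vm_mm->mm_users) */
	int current_node;  /* models numa_node_id() */
};

/*
 * Evaluated once per PTE range, before the loop: only a NUMA scan of a
 * private VMA owned by a single-user mm yields a real target node.
 */
static int pick_target_node(const struct scan_ctx *c)
{
	if (c->prot_numa && !c->vm_shared && c->mm_users == 1)
		return c->current_node;
	return NUMA_NO_NODE;
}

/*
 * Evaluated per page inside the loop: true means the PTE is left alone.
 * Real node ids are >= 0, so NUMA_NO_NODE never matches a page's node.
 */
static bool skip_page(int target_node, int page_nid)
{
	return target_node == page_nid;
}
```

With this model, a single-user mm running on node 2 skips pages that
already live on node 2 and still processes pages on other nodes, while
an mm with several users never skips.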

In theory the process could be migrated later, but we
will eventually rescan and unmap and migrate then.

In theory this could be made fancier: remembering this
state per process or even per mm. However, that would
need extra tracking and be more complicated, and the
simple check seems to work fine so far.

v2: Only do it for private VMAs. Move most of check out of loop.
v3: Minor updates from Mel. Change code layout.
Acked-by: Mel Gorman <mgorman@xxxxxxx>
Signed-off-by: Andi Kleen <ak@xxxxxxxxxxxxxxx>
---
mm/mprotect.c | 13 +++++++++++++
1 file changed, 13 insertions(+)

diff --git a/mm/mprotect.c b/mm/mprotect.c
index a4830f0325fe..11b8857c3437 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -68,11 +68,17 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
 	pte_t *pte, oldpte;
 	spinlock_t *ptl;
 	unsigned long pages = 0;
+	int target_node = NUMA_NO_NODE;

 	pte = lock_pte_protection(vma, pmd, addr, prot_numa, &ptl);
 	if (!pte)
 		return 0;

+	/* Get target node for single threaded private VMAs */
+	if (prot_numa && !(vma->vm_flags & VM_SHARED) &&
+	    atomic_read(&vma->vm_mm->mm_users) == 1)
+		target_node = numa_node_id();
+
 	arch_enter_lazy_mmu_mode();
 	do {
 		oldpte = *pte;
@@ -94,6 +100,13 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
 				/* Avoid TLB flush if possible */
 				if (pte_protnone(oldpte))
 					continue;
+
+				/*
+				 * Don't mess with PTEs if page is already on the node
+				 * a single-threaded process is running on.
+				 */
+				if (target_node == page_to_nid(page))
+					continue;
 			}

 			ptent = ptep_modify_prot_start(mm, addr, pte);