[PATCH] numa: fix /proc/<pid>/numa_maps for hugetlbfs on s390

From: Michael Holzheu
Date: Mon Jan 25 2016 - 11:31:13 EST


When working with hugetlbfs ptes (which are actually pmds) it is not
valid to use pte functions like pte_present() directly, because the
hardware bit layout of pmds and ptes can differ. This is the case on
s390. Therefore the hugetlbfs ptes must first be converted into a
valid pte encoding with huge_ptep_get().

Currently the /proc/<pid>/numa_maps code uses hugetlbfs ptes without
huge_ptep_get(). On s390 this leads to the following two problems:

1) The pte_present() function returns false (instead of true) for
PROT_NONE hugetlb ptes. Therefore PROT_NONE vmas are missing
completely in the "numa_maps" output.

2) The pte_dirty() function always returns false for all hugetlb ptes.
Therefore these pages are reported as "mapped=xxx" instead of
"dirty=xxx".

Fix both problems by converting the hugetlb ptes with huge_ptep_get()
before evaluating them.

Reviewed-by: Gerald Schaefer <gerald.schaefer@xxxxxxxxxx>
Signed-off-by: Michael Holzheu <holzheu@xxxxxxxxxxxxxxxxxx>
---
fs/proc/task_mmu.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
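
For illustration only, here is a minimal userspace sketch of why the
conversion matters. It is not kernel code: the model_* names and the
bit positions are invented for this example and do not reflect the real
s390 pmd or pte layout. It only models the situation described above,
where the same "present" and "dirty" information sits at different bit
positions in the two encodings, so the pte helpers misread a raw
hugetlb entry until it has been converted.

```c
/*
 * Userspace model, NOT kernel code. All names and bit positions are
 * hypothetical; they do not match the real s390 page table layout.
 */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct { uint64_t val; } model_pte_t;

/* Hypothetical pte encoding: the layout the pte_*() helpers expect. */
#define MODEL_PTE_PRESENT	UINT64_C(0x001)
#define MODEL_PTE_DIRTY		UINT64_C(0x002)

/* Hypothetical pmd encoding used for huge pages: same facts, other bits. */
#define MODEL_PMD_PRESENT	UINT64_C(0x100)
#define MODEL_PMD_DIRTY		UINT64_C(0x200)

/* Stand-ins for pte_present() and pte_dirty(): they only know pte bits. */
static bool model_pte_present(model_pte_t pte)
{
	return (pte.val & MODEL_PTE_PRESENT) != 0;
}

static bool model_pte_dirty(model_pte_t pte)
{
	return (pte.val & MODEL_PTE_DIRTY) != 0;
}

/* Stand-in for huge_ptep_get(): read the entry, translate it to pte bits. */
static model_pte_t model_huge_ptep_get(const model_pte_t *ptep)
{
	model_pte_t pte = { 0 };

	if (ptep->val & MODEL_PMD_PRESENT)
		pte.val |= MODEL_PTE_PRESENT;
	if (ptep->val & MODEL_PMD_DIRTY)
		pte.val |= MODEL_PTE_DIRTY;
	return pte;
}

/* Shows the misread on the raw entry and the correct result after conversion. */
static void model_selfcheck(void)
{
	model_pte_t raw = { MODEL_PMD_PRESENT | MODEL_PMD_DIRTY };
	model_pte_t huge_pte = model_huge_ptep_get(&raw);

	/* Raw pmd-encoded entry: the pte helpers see neither bit. */
	assert(!model_pte_present(raw));
	assert(!model_pte_dirty(raw));

	/* Converted entry: both properties are reported correctly. */
	assert(model_pte_present(huge_pte));
	assert(model_pte_dirty(huge_pte));
}
```

Calling model_selfcheck() from a small main() exercises both cases. In
this model the raw entry looks "not present" and "not dirty", which
mirrors the two numa_maps symptoms listed in the description.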

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 85d16c6..4a0c31f 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -1552,18 +1552,19 @@ static int gather_pte_stats(pmd_t *pmd, unsigned long addr,
 static int gather_hugetlb_stats(pte_t *pte, unsigned long hmask,
 		unsigned long addr, unsigned long end, struct mm_walk *walk)
 {
+	pte_t huge_pte = huge_ptep_get(pte);
 	struct numa_maps *md;
 	struct page *page;

-	if (!pte_present(*pte))
+	if (!pte_present(huge_pte))
 		return 0;

-	page = pte_page(*pte);
+	page = pte_page(huge_pte);
 	if (!page)
 		return 0;

 	md = walk->private;
-	gather_stats(page, md, pte_dirty(*pte), 1);
+	gather_stats(page, md, pte_dirty(huge_pte), 1);
 	return 0;
 }

--
2.3.9