[PATCH] mm, page_vma_mapped: Fix pointer arithmetic in check_pte()

From: Kirill A. Shutemov
Date: Thu Jan 18 2018 - 10:24:07 EST


Tetsuo reported random crashes under memory pressure on a 32-bit x86
system and tracked them down to the change that introduced
page_vma_mapped_walk().

The root cause of the issue is faulty pointer arithmetic in check_pte().
As ->pte may point to an arbitrary page, we have to check that it belongs
to the same section as pvmw->page before doing the subtraction. Otherwise
the result is meaningless.

This went unnoticed until now because mem_map[] is virtually contiguous on
flatmem and on vmemmap sparsemem, so pointer arithmetic just works across
all 'struct page' pointers. With classic sparsemem, it doesn't.

Let's restructure the code a bit and add the necessary check.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx>
Reported-by: Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx>
Fixes: ace71a19cec5 ("mm: introduce page_vma_mapped_walk()")
Cc: stable@xxxxxxxxxxxxxxx
---
mm/page_vma_mapped.c | 66 +++++++++++++++++++++++++++++++++++-----------------
1 file changed, 45 insertions(+), 21 deletions(-)

diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
index d22b84310f6d..de195dcdfbd8 100644
--- a/mm/page_vma_mapped.c
+++ b/mm/page_vma_mapped.c
@@ -30,8 +30,28 @@ static bool map_pte(struct page_vma_mapped_walk *pvmw)
return true;
}

+/**
+ * check_pte - check if @pvmw->page is mapped at the @pvmw->pte
+ *
+ * page_vma_mapped_walk() found a place where @pvmw->page is *potentially*
+ * mapped. check_pte() has to validate this.
+ *
+ * @pvmw->pte may point to an empty PTE, a swap PTE, or a PTE pointing to
+ * an arbitrary page.
+ *
+ * If the PVMW_MIGRATION flag is set, returns true if @pvmw->pte contains
+ * a migration entry that points to @pvmw->page, or any subpage in case of THP.
+ *
+ * If the PVMW_MIGRATION flag is not set, returns true if @pvmw->pte points
+ * to @pvmw->page, or any subpage in case of THP.
+ *
+ * Otherwise, returns false.
+ *
+ */
static bool check_pte(struct page_vma_mapped_walk *pvmw)
{
+ struct page *page;
+
if (pvmw->flags & PVMW_MIGRATION) {
#ifdef CONFIG_MIGRATION
swp_entry_t entry;
@@ -41,37 +61,41 @@ static bool check_pte(struct page_vma_mapped_walk *pvmw)

if (!is_migration_entry(entry))
return false;
- if (migration_entry_to_page(entry) - pvmw->page >=
- hpage_nr_pages(pvmw->page)) {
- return false;
- }
- if (migration_entry_to_page(entry) < pvmw->page)
- return false;
+
+ page = migration_entry_to_page(entry);
#else
WARN_ON_ONCE(1);
#endif
- } else {
- if (is_swap_pte(*pvmw->pte)) {
- swp_entry_t entry;
+ } else if (is_swap_pte(*pvmw->pte)) {
+ swp_entry_t entry;

- entry = pte_to_swp_entry(*pvmw->pte);
- if (is_device_private_entry(entry) &&
- device_private_entry_to_page(entry) == pvmw->page)
- return true;
- }
+ /* Handle un-addressable ZONE_DEVICE memory */
+ entry = pte_to_swp_entry(*pvmw->pte);
+ if (!is_device_private_entry(entry))
+ return false;

+ page = device_private_entry_to_page(entry);
+ } else {
if (!pte_present(*pvmw->pte))
return false;

- /* THP can be referenced by any subpage */
- if (pte_page(*pvmw->pte) - pvmw->page >=
- hpage_nr_pages(pvmw->page)) {
- return false;
- }
- if (pte_page(*pvmw->pte) < pvmw->page)
- return false;
+ page = pte_page(*pvmw->pte);
}

+ /*
+ * Make sure that the pages are in the same section before doing pointer
+ * arithmetic.
+ */
+ if (page_to_section(pvmw->page) != page_to_section(page))
+ return false;
+
+ if (page < pvmw->page)
+ return false;
+
+ /* THP can be referenced by any subpage */
+ if (page - pvmw->page >= hpage_nr_pages(pvmw->page))
+ return false;
+
return true;
}

--
Kirill A. Shutemov