[PATCH 4.4 126/133] x86/mm: Fix kern_addr_valid() to cope with existing but not present entries
From: Greg Kroah-Hartman
Date: Mon Sep 20 2021 - 12:55:38 EST
From: Mike Rapoport <rppt@xxxxxxxxxxxxx>
commit 34b1999da935a33be6239226bfa6cd4f704c5c88 upstream.
Jiri Olsa reported a fault when running:
# cat /proc/kallsyms | grep ksys_read
ffffffff8136d580 T ksys_read
# objdump -d --start-address=0xffffffff8136d580 --stop-address=0xffffffff8136d590 /proc/kcore
/proc/kcore: file format elf64-x86-64
Segmentation fault
general protection fault, probably for non-canonical address 0xf887ffcbff000: 0000 [#1] SMP PTI
CPU: 12 PID: 1079 Comm: objdump Not tainted 5.14.0-rc5qemu+ #508
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-4.fc34 04/01/2014
RIP: 0010:kern_addr_valid
Call Trace:
read_kcore
? rcu_read_lock_sched_held
? rcu_read_lock_sched_held
? rcu_read_lock_sched_held
? trace_hardirqs_on
? rcu_read_lock_sched_held
? lock_acquire
? lock_acquire
? rcu_read_lock_sched_held
? lock_acquire
? rcu_read_lock_sched_held
? rcu_read_lock_sched_held
? rcu_read_lock_sched_held
? lock_release
? _raw_spin_unlock
? __handle_mm_fault
? rcu_read_lock_sched_held
? lock_acquire
? rcu_read_lock_sched_held
? lock_release
proc_reg_read
? vfs_read
vfs_read
ksys_read
do_syscall_64
entry_SYSCALL_64_after_hwframe
The fault happens because kern_addr_valid() dereferences existent but not
present PMD in the high kernel mappings.
Such PMDs are created when free_kernel_image_pages() frees regions larger
than 2Mb. In this case, a part of the freed memory is mapped with PMDs and
the set_memory_np_noalias() -> ... -> __change_page_attr() sequence will
mark the PMD as not present rather than wipe it completely.
Have kern_addr_valid() check whether higher level page table entries are
present before trying to dereference them to fix this issue and to avoid
similar issues in the future.
Stable backporting note:
------------------------
Note that the stable marking is for all active stable branches because
there could be cases where pagetable entries exist but are not valid -
see 9a14aefc1d28 ("x86: cpa, fix lookup_address"), for example. So make
sure to be on the safe side here and use pXY_present() accessors rather
than pXY_none() which could #GP when accessing pages in the direct map.
Also see:
c40a56a7818c ("x86/mm/init: Remove freed kernel image areas from alias mapping")
for more info.
Reported-by: Jiri Olsa <jolsa@xxxxxxxxxx>
Signed-off-by: Mike Rapoport <rppt@xxxxxxxxxxxxx>
Signed-off-by: Borislav Petkov <bp@xxxxxxx>
Reviewed-by: David Hildenbrand <david@xxxxxxxxxx>
Acked-by: Dave Hansen <dave.hansen@xxxxxxxxx>
Tested-by: Jiri Olsa <jolsa@xxxxxxxxxx>
Cc: <stable@xxxxxxxxxxxxxxx> # 4.4+
Link: https://lkml.kernel.org/r/20210819132717.19358-1-rppt@xxxxxxxxxx
Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
---
arch/x86/mm/init_64.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -1182,21 +1182,21 @@ int kern_addr_valid(unsigned long addr)
return 0;
pud = pud_offset(pgd, addr);
- if (pud_none(*pud))
+ if (!pud_present(*pud))
return 0;
if (pud_large(*pud))
return pfn_valid(pud_pfn(*pud));
pmd = pmd_offset(pud, addr);
- if (pmd_none(*pmd))
+ if (!pmd_present(*pmd))
return 0;
if (pmd_large(*pmd))
return pfn_valid(pmd_pfn(*pmd));
pte = pte_offset_kernel(pmd, addr);
- if (pte_none(*pte))
+ if (!pte_present(*pte))
return 0;
return pfn_valid(pte_pfn(*pte));