Re: [PATCH] powerpc/vdso: Separate vvar vma from vdso

From: Christophe Leroy
Date: Tue Mar 30 2021 - 04:42:19 EST




Le 26/03/2021 à 20:17, Dmitry Safonov a écrit :
Since commit 511157ab641e ("powerpc/vdso: Move vdso datapage up front")
VVAR page is in front of the VDSO area. In result it breaks CRIU
(Checkpoint Restore In Userspace) [1], where CRIU expects that "[vdso]"
from /proc/../maps points at ELF/vdso image, rather than at VVAR data page.
Laurent made a patch to keep CRIU working (by reading aux vector).
But I think it still makes sence to separate two mappings into different
VMAs. It will also make ppc64 less "special" for userspace and as
a side-bonus will make VVAR page un-writable by debugger (which previously
would COW page and can be unexpected).

I opportunistically Cc stable on it: I understand that usually such
stuff isn't a stable material, but that will allow us in CRIU have
one workaround less that is needed just for one release (v5.11) on
one platform (ppc64), which we otherwise have to maintain.
I wouldn't go as far as to say that the commit 511157ab641e is ABI
regression as no other userspace got broken, but I'd really appreciate
if it gets backported to v5.11 after v5.12 is released, so as not
to complicate already non-simple CRIU-vdso code. Thanks!

Cc: Andrei Vagin <avagin@xxxxxxxxx>
Cc: Andy Lutomirski <luto@xxxxxxxxxx>
Cc: Benjamin Herrenschmidt <benh@xxxxxxxxxxxxxxxxxxx>
Cc: Christophe Leroy <christophe.leroy@xxxxxxxxxx>
Cc: Laurent Dufour <ldufour@xxxxxxxxxxxxx>
Cc: Michael Ellerman <mpe@xxxxxxxxxxxxxx>
Cc: Paul Mackerras <paulus@xxxxxxxxx>
Cc: linuxppc-dev@xxxxxxxxxxxxxxxx
Cc: stable@xxxxxxxxxxxxxxx # v5.11
[1]: https://github.com/checkpoint-restore/criu/issues/1417
Signed-off-by: Dmitry Safonov <dima@xxxxxxxxxx>
Tested-by: Christophe Leroy <christophe.leroy@xxxxxxxxxx>
---
arch/powerpc/include/asm/mmu_context.h | 2 +-
arch/powerpc/kernel/vdso.c | 54 +++++++++++++++++++-------
2 files changed, 40 insertions(+), 16 deletions(-)


@@ -133,7 +135,13 @@ static int __arch_setup_additional_pages(struct linux_binprm *bprm, int uses_int
* install_special_mapping or the perf counter mmap tracking code
* will fail to recognise it as a vDSO.
*/
- mm->context.vdso = (void __user *)vdso_base + PAGE_SIZE;
+ mm->context.vdso = (void __user *)vdso_base + vvar_size;
+
+ vma = _install_special_mapping(mm, vdso_base, vvar_size,
+ VM_READ | VM_MAYREAD | VM_IO |
+ VM_DONTDUMP | VM_PFNMAP, &vvar_spec);
+ if (IS_ERR(vma))
+ return PTR_ERR(vma);
/*
* our vma flags don't have VM_WRITE so by default, the process isn't


IIUC, VM_PFNMAP is for when we have a vvar_fault handler.
Allthough we will soon have one for handle TIME_NS, at the moment powerpc doesn't have that handler.
Isn't it dangerous to set VM_PFNMAP then ?

Christophe