Re: [PATCH] powerpc/vdso: Separate vvar vma from vdso

From: Michael Ellerman
Date: Wed Mar 31 2021 - 06:00:22 EST


Christophe Leroy <christophe.leroy@xxxxxxxxxx> writes:
> On 26/03/2021 at 20:17, Dmitry Safonov wrote:
>> Since commit 511157ab641e ("powerpc/vdso: Move vdso datapage up front")
>> the VVAR page is in front of the VDSO area. As a result, it breaks CRIU
>> (Checkpoint Restore In Userspace) [1], where CRIU expects that "[vdso]"
>> from /proc/../maps points at the ELF/vdso image, rather than at the VVAR
>> data page. Laurent made a patch to keep CRIU working (by reading the aux
>> vector). But I think it still makes sense to separate the two mappings
>> into different VMAs. It will also make ppc64 less "special" for userspace
>> and, as a side bonus, will make the VVAR page un-writable by a debugger
>> (previously a write would COW the page, which can be unexpected).
>>
>> I opportunistically Cc stable on it: I understand that usually such
>> stuff isn't stable material, but it will allow us in CRIU to have
>> one workaround less that is needed just for one release (v5.11) on
>> one platform (ppc64), which we would otherwise have to maintain.
>> I wouldn't go as far as to say that commit 511157ab641e is an ABI
>> regression, as no other userspace got broken, but I'd really appreciate
>> it if it gets backported to v5.11 after v5.12 is released, so as not
>> to complicate the already non-simple CRIU-vdso code. Thanks!
>>
>> Cc: Andrei Vagin <avagin@xxxxxxxxx>
>> Cc: Andy Lutomirski <luto@xxxxxxxxxx>
>> Cc: Benjamin Herrenschmidt <benh@xxxxxxxxxxxxxxxxxxx>
>> Cc: Christophe Leroy <christophe.leroy@xxxxxxxxxx>
>> Cc: Laurent Dufour <ldufour@xxxxxxxxxxxxx>
>> Cc: Michael Ellerman <mpe@xxxxxxxxxxxxxx>
>> Cc: Paul Mackerras <paulus@xxxxxxxxx>
>> Cc: linuxppc-dev@xxxxxxxxxxxxxxxx
>> Cc: stable@xxxxxxxxxxxxxxx # v5.11
>> [1]: https://github.com/checkpoint-restore/criu/issues/1417
>> Signed-off-by: Dmitry Safonov <dima@xxxxxxxxxx>
>> Tested-by: Christophe Leroy <christophe.leroy@xxxxxxxxxx>
>> ---
>> arch/powerpc/include/asm/mmu_context.h | 2 +-
>> arch/powerpc/kernel/vdso.c | 54 +++++++++++++++++++-------
>> 2 files changed, 40 insertions(+), 16 deletions(-)
>>
>
>> @@ -133,7 +135,13 @@ static int __arch_setup_additional_pages(struct linux_binprm *bprm, int uses_int
>>  	 * install_special_mapping or the perf counter mmap tracking code
>>  	 * will fail to recognise it as a vDSO.
>>  	 */
>> -	mm->context.vdso = (void __user *)vdso_base + PAGE_SIZE;
>> +	mm->context.vdso = (void __user *)vdso_base + vvar_size;
>> +
>> +	vma = _install_special_mapping(mm, vdso_base, vvar_size,
>> +				       VM_READ | VM_MAYREAD | VM_IO |
>> +				       VM_DONTDUMP | VM_PFNMAP, &vvar_spec);
>> +	if (IS_ERR(vma))
>> +		return PTR_ERR(vma);
>>  
>>  	/*
>>  	 * our vma flags don't have VM_WRITE so by default, the process isn't
>
>
> IIUC, VM_PFNMAP is for when we have a vvar_fault handler.

Some of the other flags seem odd too,
e.g. VM_IO? VM_DONTDUMP?
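
For context, the userspace expectation described in the quoted changelog
(that the "[vdso]" entry in /proc/<pid>/maps starts at the ELF image and
not at the VVAR data page) can be sketched roughly as below. This is only
an illustration, not CRIU's actual code: the helper names
maps_line_matches() and is_elf_image() are hypothetical, and the sketch
assumes the usual "start-end perms offset dev inode name" layout of a
maps line.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: parse one /proc/<pid>/maps line. If the mapping
 * name equals `name` (e.g. "[vdso]"), store its bounds and return 1,
 * otherwise return 0.
 */
static int maps_line_matches(const char *line, const char *name,
                             unsigned long long *start,
                             unsigned long long *end)
{
    unsigned long long s, e;
    int off = 0;
    size_t len;

    /* Skip perms, offset, dev and inode; %n records where the name starts. */
    if (sscanf(line, "%llx-%llx %*s %*s %*s %*s %n", &s, &e, &off) != 2 ||
        off == 0)
        return 0;

    /* Compare the name without the trailing newline. */
    len = strcspn(line + off, "\n");
    if (len != strlen(name) || strncmp(line + off, name, len) != 0)
        return 0;

    *start = s;
    *end = e;
    return 1;
}

/* Hypothetical helper: does the buffer begin with the ELF magic? */
static int is_elf_image(const unsigned char *p, size_t len)
{
    return len >= 4 && memcmp(p, "\177ELF", 4) == 0;
}
```

A checker would feed each line of /proc/self/maps to maps_line_matches()
and then run is_elf_image() on the first bytes at the returned start
address. With the VVAR page folded into the front of the same VMA, that
second check is the one that would fail.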


cheers