Re: [PATCH] arm64/vdso: Support mremap() for vDSO
From: Will Deacon
Date: Wed Aug 02 2017 - 12:04:21 EST

On Fri, Jul 28, 2017 at 10:06:20PM +0300, Dmitry Safonov wrote:
> 2017-07-28 19:48 GMT+03:00 Will Deacon <will.deacon@xxxxxxx>:
> > On Wed, Jul 26, 2017 at 08:07:37PM +0300, Dmitry Safonov wrote:
> >> The vDSO VMA address is saved in mm_context so that the restorer in
> >> the vDSO page can be used to return to userspace after signal handling.
> >>
> >> In the Checkpoint Restore in Userspace (CRIU) project we place the vDSO
> >> VMA on restore back at the address it had at dump time.
> >> With the exception of x86 (where there is an API to map the vDSO with
> >> arch_prctl()), we move the vDSO inherited from the CRIU task to the
> >> restoree's position with mremap().
> >>
> >> CRIU does support the arm64 architecture, but the kernel doesn't update
> >> the context.vdso pointer after mremap(), which results in a translation
> >> fault after signal handling in the restored application:
> >> https://github.com/xemul/criu/issues/288
> >>
> >> Make the vDSO code track the VMA address by supplying a .mremap() hook
> >> in vm_special_mapping, the same way it's done for x86 and arm32 by:
> >> commit b059a453b1cf ("x86/vdso: Add mremap hook to vm_special_mapping")
> >> commit 280e87e98c09 ("ARM: 8683/1: ARM32: Support mremap() for sigpage/vDSO").
> >>
> >> Cc: Catalin Marinas <catalin.marinas@xxxxxxx>
> >> Cc: Will Deacon <will.deacon@xxxxxxx>
> >> Cc: Russell King <rmk+kernel@xxxxxxxxxxxxxxx>
> >> Cc: linux-arm-kernel@xxxxxxxxxxxxxxxxxxx
> >> Cc: Cyrill Gorcunov <gorcunov@xxxxxxxxxx>
> >> Cc: Pavel Emelyanov <xemul@xxxxxxxxxxxxx>
> >> Cc: Christopher Covington <cov@xxxxxxxxxxxxxx>
> >> Signed-off-by: Dmitry Safonov <dsafonov@xxxxxxxxxxxxx>
> >> ---
> >> arch/arm64/kernel/vdso.c | 15 +++++++++++++++
> >> 1 file changed, 15 insertions(+)
> >>
> >> diff --git a/arch/arm64/kernel/vdso.c b/arch/arm64/kernel/vdso.c
> >> index e8f759f764f2..2d419006ad43 100644
> >> --- a/arch/arm64/kernel/vdso.c
> >> +++ b/arch/arm64/kernel/vdso.c
> >> @@ -110,12 +110,27 @@ int aarch32_setup_vectors_page(struct linux_binprm *bprm, int uses_interp)
> >> }
> >> #endif /* CONFIG_COMPAT */
> >>
> >> +static int vdso_mremap(const struct vm_special_mapping *sm,
> >> +		struct vm_area_struct *new_vma)
> >> +{
> >> +	unsigned long new_size = new_vma->vm_end - new_vma->vm_start;
> >> +	unsigned long vdso_size = vdso_end - vdso_start;
> >
> > You might be able to use vdso_pages here, but it depends on my question
> > below.
>
> Yes, shifting with PAGE_SHIFT.
> Is it just a preference?

Yeah, just a minor thing, although thinking about it again, I don't know
what you're trying to achieve with the size check anyway. Userspace is only
going to hurt itself if it screws up the layout, so why police this?

> >
> >> +
> >> +	if (vdso_size != new_size)
> >> +		return -EINVAL;
> >> +
> >> +	current->mm->context.vdso = (void *)new_vma->vm_start;
> >> +
> >> +	return 0;
> >> +}
> >> +
> >>  static struct vm_special_mapping vdso_spec[2] __ro_after_init = {
> >>  	{
> >>  		.name = "[vvar]",
> >>  	},
> >>  	{
> >>  		.name = "[vdso]",
> >> +		.mremap = vdso_mremap,
> >
> > Does this mean we move the vdso text, but not the data page? How does that
> > work?
>
> Well, the kernel tracks only vdso pages - to find restorer addr after a signal.
> In userspace one needs to move vvar and vdso vma pair accordingly,
> with the same order and offset of course.

Ah, I see. I misunderstood what the .mremap callback was actually doing.
I guess there's also no issue with not being able to do this atomically,
either, as long as you can avoid making syscalls via the vDSO until you've
relocated both mappings.

Will
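
For illustration only, here is a minimal userspace sketch of the "move the
pair with the same order and offset" step discussed above. It is not CRIU's
code: struct vma_pair, move_pair() and demo_move_pair() are hypothetical
names, and the demo relocates two anonymous pages standing in for [vvar] and
[vdso] rather than the real mappings. It assumes Linux and glibc, and keeps
each mapping's size unchanged, which is what the size check in the quoted
patch would require.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical description of two mappings that must move together. */
struct vma_pair {
	void *first;		/* lower mapping, e.g. [vvar] */
	size_t first_len;
	void *second;		/* upper mapping, e.g. [vdso] */
	size_t second_len;
};

/*
 * Move both mappings so that 'first' lands at new_base and 'second' keeps
 * its original offset from 'first'. The two mremap() calls are not atomic:
 * nothing should call into the vDSO between them, and if the second call
 * fails the first mapping has already been moved.
 */
static int move_pair(struct vma_pair *p, void *new_base)
{
	size_t gap;
	void *a, *b;

	/* The sketch assumes 'second' sits at or above the end of 'first'. */
	assert((char *)p->second >= (char *)p->first + p->first_len);
	gap = (size_t)((char *)p->second - (char *)p->first);

	a = mremap(p->first, p->first_len, p->first_len,
		   MREMAP_MAYMOVE | MREMAP_FIXED, new_base);
	if (a == MAP_FAILED)
		return -1;

	b = mremap(p->second, p->second_len, p->second_len,
		   MREMAP_MAYMOVE | MREMAP_FIXED, (char *)new_base + gap);
	if (b == MAP_FAILED)
		return -1;

	p->first = a;
	p->second = b;
	return 0;
}

/* Demo on stand-in anonymous pages; returns 0 if both pages moved intact. */
static int demo_move_pair(void)
{
	size_t page = (size_t)sysconf(_SC_PAGESIZE);
	char *src = mmap(NULL, 2 * page, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	/* Reserve a destination range; MREMAP_FIXED replaces what is there. */
	char *dst = mmap(NULL, 2 * page, PROT_NONE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	struct vma_pair p;

	if (src == MAP_FAILED || dst == MAP_FAILED)
		return -1;

	src[0] = 'D';		/* marker in the stand-in data page */
	src[page] = 'T';	/* marker in the stand-in text page */

	p.first = src;
	p.first_len = page;
	p.second = src + page;
	p.second_len = page;

	if (move_pair(&p, dst) != 0)
		return -1;

	return (dst[0] == 'D' && dst[page] == 'T' &&
		p.first == (void *)dst &&
		p.second == (void *)(dst + page)) ? 0 : -1;
}
```

Calling demo_move_pair() from main() should return 0 when both stand-in pages
arrive at the new base with their relative offset preserved. Moving the real
[vvar]/[vdso] pair would additionally need their addresses (for example from
/proc/self/maps) and care that libc has not cached vDSO function pointers.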