Re: [PATCHv2 3/6] x86/arch_prctl/vdso: add ARCH_MAP_VDSO_*
From: Andy Lutomirski
Date: Sun Jul 10 2016 - 08:45:08 EST
On Thu, Jul 7, 2016 at 4:11 AM, Dmitry Safonov <dsafonov@xxxxxxxxxxxxx> wrote:
> On 07/06/2016 05:30 PM, Andy Lutomirski wrote:
>>
>> On Wed, Jun 29, 2016 at 3:57 AM, Dmitry Safonov <dsafonov@xxxxxxxxxxxxx>
>> wrote:
>>>
>>> Add an API to change the vdso blob type with arch_prctl.
>>> As this is useful only for the needs of CRIU, expose
>>> this interface under CONFIG_CHECKPOINT_RESTORE.
>>
>>
>>> +#ifdef CONFIG_CHECKPOINT_RESTORE
>>> +	case ARCH_MAP_VDSO_X32:
>>> +		return do_map_vdso(VDSO_X32, addr, false);
>>> +	case ARCH_MAP_VDSO_32:
>>> +		return do_map_vdso(VDSO_32, addr, false);
>>> +	case ARCH_MAP_VDSO_64:
>>> +		return do_map_vdso(VDSO_64, addr, false);
>>> +#endif
>>> +
>>
>>
>> This will have an odd side effect: if the old mapping is still around,
>> its .fault will start behaving erratically. I wonder if we can either
>> reliably zap the old vma (or check that it's not there any more)
>> before mapping a new one or whether we can associate the vdso image
>> with the vma (possibly by having a separate vm_special_mapping for
>> each vdso_image). The latter is quite easy: change vdso_image to embed
>> vm_special_mapping and use container_of in vdso_fault to fish the
>> vdso_image back out. But we'd have to embed another
>> vm_special_mapping for the vvar mapping as well for the same reason.
>>
>> I'm also a bit concerned that __install_special_mapping might not get
>> all the cgroup and rlimit stuff right. If we ensure that any old
>> mappings are gone, then the damage is bounded, but otherwise someone
>> might call this in a loop and fill their address space with arbitrary
>> numbers of special mappings.
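
For reference, the container_of idea quoted above can be sketched in plain
user-space C. This is only an illustration, not kernel code: the struct
names, the fields, and the helper functions below are hypothetical
stand-ins, and the macro is a simplified version of the kernel's
container_of. The point is that with one embedded mapping object per image,
a pointer to the mapping is enough to recover the image it belongs to.

```c
#include <stddef.h>

/* Hypothetical stand-in for the kernel's vm_special_mapping. */
struct special_mapping_sketch {
	const char *name;
};

/* Hypothetical stand-in for vdso_image, with both mappings embedded. */
struct vdso_image_sketch {
	unsigned long size;
	struct special_mapping_sketch text_mapping;	/* one per image */
	struct special_mapping_sketch vvar_mapping;	/* vvar needs its own */
};

/* Simplified container_of: step back from a member to its enclosing struct. */
#define container_of_sketch(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Recover the image from a pointer to its embedded text mapping. */
static struct vdso_image_sketch *
image_from_text_mapping(struct special_mapping_sketch *sm)
{
	return container_of_sketch(sm, struct vdso_image_sketch, text_mapping);
}

/* Same for the vvar mapping, which sits at a different offset. */
static struct vdso_image_sketch *
image_from_vvar_mapping(struct special_mapping_sketch *sm)
{
	return container_of_sketch(sm, struct vdso_image_sketch, vvar_mapping);
}
```

With this layout a fault handler that is handed the mapping pointer never
has to consult per-process state to decide which image to serve, so an old
mapping would keep resolving to its own image.
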
>
>
> Well, I deleted the code that unmaps the old vdso because I didn't see
> a reason why it's bad and wanted to reduce code. But well, now I do see
> reasons, thanks.
>
> Hmm, what do you think if I do it a little differently than embedding
> vm_special_mapping: just that old hack with vma_ops. If I add a close()
> hook there that sets context.vdso to NULL, then I can test
> it on remap. This can also have the nice feature of restricting partial
> munmap of the vdso blob. Does this sound sane?

I think so, as long as you do something to make sure that vvar gets
unmapped as well.
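
A rough user-space model of that bookkeeping, to make the vvar point
concrete. Everything here is an assumption for illustration: the struct,
the function names, the separate vvar pointer, and the choice to refuse
with -EEXIST rather than unmap the old pages are not taken from the patch.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-mm context. */
struct mm_context_sketch {
	void *vdso;	/* non-NULL while a vdso text mapping exists */
	void *vvar;	/* non-NULL while a vvar mapping exists */
};

/* Models the proposed close() hook for the vdso text mapping. */
static void vdso_close_sketch(struct mm_context_sketch *ctx)
{
	ctx->vdso = NULL;
}

/* Models an equivalent hook for the vvar mapping. */
static void vvar_close_sketch(struct mm_context_sketch *ctx)
{
	ctx->vvar = NULL;
}

/*
 * Models the check on remap: refuse while either old mapping is still
 * around. Returning -EEXIST is an assumption made for this sketch.
 */
static int map_vdso_sketch(struct mm_context_sketch *ctx,
			   void *text_addr, void *vvar_addr)
{
	if (ctx->vdso != NULL || ctx->vvar != NULL)
		return -EEXIST;

	ctx->vdso = text_addr;
	ctx->vvar = vvar_addr;
	return 0;
}
```

In this model, closing only the text mapping is not enough: a second map
attempt keeps failing until the vvar hook has run too.
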

Oleg, want to sanity-check us? If .mremap ensures that only the entire
vma can be remapped and .close ensures that only the whole vma can be
unmapped, do you believe we are okay? Or will we have issues with
mprotect?
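
The "entire vma only" condition amounts to an exact range comparison. A
minimal user-space sketch follows, with a hypothetical struct and function
name. It shows only the check itself; how the real hooks would enforce it,
and whether mprotect slips past it, is exactly the open question.

```c
#include <errno.h>

/* Hypothetical stand-in for the address range of a vma: [start, end). */
struct vma_sketch {
	unsigned long start;
	unsigned long end;
};

/*
 * Returns 0 when [start, start + len) covers the mapping exactly and
 * -EINVAL for any partial range.
 */
static int check_whole_vma_sketch(const struct vma_sketch *vma,
				  unsigned long start, unsigned long len)
{
	if (start != vma->start || len != vma->end - vma->start)
		return -EINVAL;
	return 0;
}
```

A request for the first half or the second half of the mapping is rejected,
while a request for the full range passes.
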
--
Andy Lutomirski
AMA Capital Management, LLC