Re: [PATCH 1/2] x86/arch_prctl: add ARCH_SET_{COMPAT,NATIVE} to change compatible mode

From: Andy Lutomirski
Date: Fri Apr 08 2016 - 11:57:03 EST


On Fri, Apr 8, 2016 at 6:50 AM, Dmitry Safonov <dsafonov@xxxxxxxxxxxxx> wrote:
> On 04/07/2016 05:39 PM, Andy Lutomirski wrote:
>>
>> For 32-bit, the vdso *must* exist in memory at the address that the
>> kernel thinks it's at. Even if you had a pure 32-bit restore stub,
>> you would still need vdso remap, because there's a chance the vdso
>> could land at an unusable address, say one page off from where you
>> want it. You couldn't map a wrapper because there wouldn't be any
>> space for it without moving the real vdso out of the way.
>>
>> Remember, you *cannot* mremap() the 32-bit vdso because you will
>> crash. It works by luck for 64-bit, but it's plausible that we'd want
>> to change that some day. (I have awful patches that speed a bunch of
>> things up at the cost of a vdso trampoline for 64-bit code and a bunch
>> of other hacks. Those patches will never go in for real, but
>> something else might want the ability to use 64-bit vdso trampolines.)
>
> Hello again,
> what do you think about the attached patch?
> I think it should fix the landing problem for i386 vdso mremap.
> It does not touch the fast syscall path, so there should be no
> speed regression.
>>>
>>> I did the vdso remapping because the blob for a native x86_64 task
>>> differs from the one for a compat task. So it's just changing blobs;
>>> the address value is there for convenience - I may omit it and just
>>> map a different vdso blob at the same place where the previous vdso was.
>>> I'm not sure why we need the possibility to map a 64-bit vdso blob
>>> on native 32-bit builds?
>>
>> That would fail, but I think the API should exist. But a native
>> 32-bit program should be able to remap the 32-bit vdso.
>>
>> IOW, I think you should be able to do, roughly:
>>
>> map_new_vdso(VDSO_32BIT, addr);
>>
>> on any kernel.
>>
>> Am I making sense?
>
> I will still work on this interface - I just wanted the
> usual mremap to work on vdso mappings.

For this thing:

+	/* Fixing userspace landing - look at do_fast_syscall_32 */
+	if (current_thread_info()->status & TS_COMPAT)
+		regs->ip = (unsigned long)current->mm->context.vdso +
+			vdso_image_32.sym_int80_landing_pad;

Either check that ip was where you expected it or simply remove this
code -- user programs that are mremapping the vdso are already playing
with fire and can just use int $0x80 to do it.
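
A minimal sketch of the first option (check ip before rewriting it), written as a userspace model rather than kernel code. The struct and function names here are hypothetical and are not taken from the patch; only the idea of comparing ip against the old landing pad comes from the comment above.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the saved user registers (not struct pt_regs). */
struct fake_regs {
	unsigned long ip;
};

/*
 * Rewrite ip to the landing pad of the relocated vdso, but only if it
 * currently sits on the landing pad of the old mapping. Any other ip
 * is left untouched. Returns true when ip was rewritten.
 *
 * old_vdso, new_vdso and landing_pad_off are assumed inputs; in the
 * kernel they would come from the mm context and the vdso image.
 */
static bool fixup_landing_ip(struct fake_regs *regs,
			     unsigned long old_vdso,
			     unsigned long new_vdso,
			     unsigned long landing_pad_off)
{
	if (regs->ip != old_vdso + landing_pad_off)
		return false;

	regs->ip = new_vdso + landing_pad_off;
	return true;
}
```

With this shape, a task that entered through int $0x80 from its own code keeps its ip, and only a task that is actually parked on the old landing pad gets moved.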

Other than that, it looks generally sane. The .mremap hook didn't
exist last time I looked at this :)

The main downside of your approach is that it doesn't allow switching
between the 32-bit, 64-bit, and x32 images. Also, it requires
awareness of how vvar and vdso line up, whereas a dedicated API could
do the whole thing.
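
To illustrate the second point, here is a rough sketch of the bookkeeping a program would have to do itself when moving the vdso with plain mremap(). It assumes that vvar must stay at the same distance from the vdso after the move; the struct and helper names are hypothetical.

```c
/* Hypothetical record of where the two mappings currently start. */
struct vdso_layout {
	unsigned long vvar_start;
	unsigned long vdso_start;
};

/*
 * Given the current layout and the address the vdso is being moved to,
 * return the address vvar has to be moved to so that the distance
 * between the two mappings is preserved (assumed requirement).
 */
static unsigned long vvar_target(const struct vdso_layout *old,
				 unsigned long new_vdso)
{
	return new_vdso - (old->vdso_start - old->vvar_start);
}
```

A dedicated API would hide this arithmetic, since the kernel already knows how the two mappings relate.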

>
> Thanks,
> Dmitry.



--
Andy Lutomirski
AMA Capital Management, LLC