Re: [PATCH] x86/kconfig/32: Mark CONFIG_VM86 as BROKEN
From: Brian Gerst
Date: Wed Jul 08 2015 - 15:40:02 EST
On Wed, Jul 8, 2015 at 3:14 PM, Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
> On Wed, Jul 8, 2015 at 12:05 PM, Brian Gerst <brgerst@xxxxxxxxx> wrote:
>> On Wed, Jul 8, 2015 at 1:30 PM, Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
>>> On Wed, Jul 8, 2015 at 9:59 AM, Linus Torvalds
>>> <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>>>> On Tue, Jul 7, 2015 at 7:33 PM, Arjan van de Ven <arjan@xxxxxxxxxxxxxxx> wrote:
>>>>>
>>>>> if this patch would not be acceptable, at minimum we need some sort of
>>>>> "off by default unless the sysadmin flips a sysfs thing", which is
>>>>> really just a huge hack.
>>>>
>>>> The only thing that matters is whether people use this or not.
>>>>
>>>
>>> I think that the world contains precisely two programs that use the
>>> vm86 syscalls. One is dosemu, and one is a test case I wrote. (There
>>> are probably some exploits written by other people that I don't know
>>> about. Certainly Spender has been patching vm86 for long enough that
>>> he must have an exploit or two up his sleeve.)
>>>
>>> As far as I can tell (and I'll try to test this better for real later
>>> this week), dosemu already knows how to emulate real mode if vm86 is
>>> unavailable. So it's unclear that turning off the vm86 syscalls
>>> actually breaks anything whatsoever.
>>>
>>> On the other hand, sys_vm86 fails if the syscall slow path is in use.
>>> That means that quite a few Fedora versions (auditing), anything with
>>> ptrace, seccomp (before 3.16 IIRC), and anything with context tracking
>>> is probably actually *improved* by turning off the vm86 syscalls even
>>> for dosemu users.
>>>
>>> And apparently Ubuntu has had CONFIG_VM86 disabled forever.
>>>
>>> IOW, vm86 really is broken.
>>>
>>>> If people use vm86 mode, we can't just disable it. It's that simple.
>>>> "It's poorly maintained" isn't an argument for removal. Only "nobody
>>>> cares" works as an argument for that.
>>>>
>>>> My suspicion is that people still do use vm86 mode, but who knows..
>>>> Quite frankly, rather than disable it, I'd much rather see people who
>>>> modify low-level x86 code (yes, that means you, Luto) *test* it. If
>>>> you aren't willing to test the modifications you make, I don't think
>>>> those modifications should be merged, regardless of how nice a cleanup
>>>> they are.
>>>
>>> I tried to test it. As far as I know, my changes in -tip have no
>>> effect on vm86, and the changes I'm planning on sending this week will
>>> make it work better. I still think that Linux users should have it
>>> configured out or deleted altogether. Especially people who care at
>>> all about security.
>>>
>>> It's easy to try the easy case (run from tools/testing/selftests/x86)
>>> -- this is v4.2-rc1, but most recent versions should be identical:
>>>
>>> $ ./entry_from_vm86_32
>>> [RUN] #BR from vm86 mode
>>> [OK] Exited vm86 mode due to #BR
>>> [RUN] SYSENTER from vm86 mode
>>> [OK] Exited vm86 mode due to unhandled GP fault
>>>
>>> $ strace -e vm86 ./entry_from_vm86_32
>>> [RUN] #BR from vm86 mode
>>> vm86(0x1, 0xbfa50fcc, 0xbfa50fcc, 0x80488bb, 0x1000) = -1 ENOSYS (Function not implemented)
>>> [OK] Exited vm86 mode due to type 0, arg 0
>>> [RUN] SYSENTER from vm86 mode
>>> vm86(0x1, 0xbfa50fcc, 0xbfa50fcc, 0x80488bb, 0x1000) = -1 ENOSYS (Function not implemented)
>>> [OK] Exited vm86 mode due to type 0, arg 0
>>>
>>> It only says "[OK]" because my test case isn't careful enough. That's
>>> a failure. I suspect it was a much worse failure a couple versions
>>> ago before my ENOSYS-reworking patch went in.
>>>
>>> Replace "-e vm86" with "-e write" and be puzzled. The failure mode is
>>> really pretty bad.
>>>
>>> This only tests easy stuff. The integration between vm86 and fault
>>> handling is truly awful and I don't even know how to approach testing
>>> it. I'd probably have to run twenty or thirty old real-mode games to
>>> even exercise those code paths.
>>>
>>> I'll try to confirm later this week that dosemu can really handle real
>>> mode without sys_vm86.
>>
>> None of these issues are unfixable. As I said before, many of them
>> can be resolved if vm86 is changed to use the normal syscall/exception
>> exit paths. Give me a few days to finish off that patch set.
>>
>
> I look forward to it.
>
> However: I imagine that, if you do this, you may need to be quite
> careful about an x86_32-ism. Currently, if you have a pt_regs pointer
> for the current entry and user_mode(regs) returns true, then regs ==
> current_pt_regs(). If you let user mode run with EFLAGS.VM set with
> the normal tss.sp0, then this will no longer be true, as the
> extra-long entry-from-v8086 frame will shift pt_regs by a few bytes.
> I don't know whether this matters, but I can imagine it causing
> do_signal to explode. *shudder*
I am aware that pt_regs is in a fixed location on the stack. What I
plan to do is increase the padding at the top of the stack if VM86 is
configured, to reserve space for the extra segment registers. Then it
will move tss.sp0 up 16 bytes when entering vm86 mode so that the
longer IRET frame is in the right place.
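
The arithmetic behind that can be sketched in user-space C. This is only
an illustration, not the actual patch: the struct and function names are
made up, and it assumes 32-bit stack slots with the four extra segment
registers (ES, DS, FS, GS) pushed above SS on entry from v8086 mode.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative model only; none of these names exist in the kernel. */

/* Words the CPU pushes on a normal entry from user mode, lowest address first. */
struct iret_frame_user {
    uint32_t ip, cs, flags, sp, ss;
};

/* Entry from v8086 mode pushes four extra segment registers above SS. */
struct iret_frame_vm86 {
    uint32_t ip, cs, flags, sp, ss;
    uint32_t es, ds, fs, gs;
};

#define VM86_EXTRA_BYTES \
    (sizeof(struct iret_frame_vm86) - sizeof(struct iret_frame_user))

static_assert(VM86_EXTRA_BYTES == 16, "four 32-bit segment registers");

/* Normal sp0: top of stack minus the padding reserved for vm86. */
uintptr_t sp0_normal(uintptr_t stack_top)
{
    return stack_top - VM86_EXTRA_BYTES;
}

/* sp0 while in vm86 mode: moved up into the reserved padding. */
uintptr_t sp0_vm86(uintptr_t stack_top)
{
    return sp0_normal(stack_top) + VM86_EXTRA_BYTES;
}

/* Lowest address of a frame pushed starting at sp0 (the stack grows down). */
uintptr_t frame_start(uintptr_t sp0, size_t frame_bytes)
{
    return sp0 - frame_bytes;
}
```

With a made-up stack top of 0x2000, the normal frame pushed from
sp0_normal() and the longer frame pushed from sp0_vm86() both start at
0x1fdc, so the fields they share sit at the same address either way.
Pushing the longer frame from the unadjusted sp0 would start it 16 bytes
lower, which is the shift described above.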
--
Brian Gerst