Re: [PATCH] x86/kconfig/32: Mark CONFIG_VM86 as BROKEN

From: Linus Torvalds
Date: Fri Jul 10 2015 - 13:04:33 EST


On Fri, Jul 10, 2015 at 9:44 AM, Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
>
> That's not what I mean. I'm referring to the vm86 syscall itself. If
> you have a ti flag that causes the slow exit path to be used, then you
> call vm86. vm86 sets up the ludicrous double stack frame that it uses
> and jumps back to the exit asm. The exit asm then branches off to the
> slow path, hits the notifysig_v86 kludge, calls save_v86_state, tears
> down its double stack frame, and keeps meandering back through the
> exit asm. We finally IRET right back to protected mode, and the code
> that userspace was trying to execute in v8086 mode never actually
> runs.

So?

Yes, we exit vm86 mode if anything odd happens. That's very much part
of the whole vm86() model. If the kernel needs to do anything, it
saves off the vm86 state and returns to regular 32-bit mode. That's
how it's designed to be.

What's your point?

The user mode "vm86 hypervisor" will call vm86() in a loop. Always
has. Always will.

And yes, that can mean that you never execute even a single
instruction in vm86 mode, if one of the "we have other work to do"
flags is set. Maybe a signal came in. Maybe just some delayed work
happened. Maybe it has nothing to do with user space, and we *could*
have returned to vm86 mode, but the thing is, that code sequence is
_designed_ that way - it's very much minimizing the impact of vm86
mode. Pretty much the *only* thing we ever do with the vm86 stack
still active is reschedule. Pretty much *any* other context change
issue will get rid of the vm86 mode in kernel space, saving back the
state to user space so that user space can try again.

And it was done that way to minimize the vm86 impact on the rest of the
kernel. Basically there's a few hooks in a couple of traps that say
"ok, let's handle this case for vm86 mode", and there's the "let's
reschedule without exiting the user vm86 state", but the code is
designed so that we'll just say "screw it, the user can restart, we'll
go back to normal 32-bit code because something other than just plain
returning to vm86 mode happened".

vm86() mode is not some kind of "run this DOS program to completion".
It's exactly like a (very stupid) vmx mode. There are exit conditions,
and while many of them are about the code it executes, equally many of
them are "oh, we may have some event that cannot be handled in vm86
mode like a signal happened" etc.

So yes, if the thread work flags are set, we never enter vm86 mode.
BUT THAT'S EXACTLY WHAT SHOULD HAPPEN.

It worries me that you think these kinds of fundamental issues are
completely broken.

No, I wouldn't be surprised at all if there is actual breakage, just
because vm86 mode clearly gets very little testing, but the things you
have pointed out as "broken" really haven't been as far as I can tell.

And yes, if you enable system call auditing, and you actually audit
the vm86 mode system call, that probably causes an exit condition,
which means that you can't actually run vm86 mode and make progress if
you audit that system call. Big f*cking deal. People who enable system
call auditing break so many more important things (eg basic performance)
that that isn't even an argument. Do you really think that people who
wanted to run DOS games at hardware speeds wanted to _audit_ those
games? No.

Linus