Re: stop breaking dosemu (Re: x86/kconfig/32: Rename CONFIG_VM86 and default it to 'n')

From: Austin S Hemmelgarn
Date: Fri Sep 04 2015 - 15:52:22 EST

On 2015-09-04 09:06, Stas Sergeev wrote:
04.09.2015 15:34, Austin S Hemmelgarn wrote:
On 2015-09-04 06:46, Stas Sergeev wrote:
04.09.2015 13:09, Chuck Ebbert wrote:
On Fri, 4 Sep 2015 00:28:04 +0300
Stas Sergeev <stsp@xxxxxxx> wrote:

03.09.2015 21:51, Austin S Hemmelgarn wrote:
There are servers out there that have this enabled and _never_ use it
at all,
Unless I am mistaken, servers usually use a special flavour of the
distro (different from desktop install), where of course this will
be disabled _compile time_.
Many (most?) distros use just one kernel for everything, because it's
just too much work to have a separate flavor for servers.
But for example menuconfig promotes CONFIG_PREEMPT_NONE for server
and CONFIG_PREEMPT for desktop. Also perhaps server would need an
lts version rather than latest.
I wonder if RHEL Server offers the generic desktop-suited kernel
with vm86() enabled?

In any case, if there is some generic mechanism to selectively
disable syscalls at run-time for server, then vm86() is of course
a good candidate. I wonder how many other syscalls are currently
run-time controlled? (those that are not marked as an "attack surface"
and defaulted to Y; I suppose the "attack surface" is currently only vm86())

OK, I think I need to clarify something here.

The attack surface of a given system refers to the number of different ways that someone could potentially attack that system. An individual syscall is not in itself an attack surface, but is part of
the attack surface for the whole system. One of the core concepts of proactive security is to minimize the attack surface, because the fewer ways someone could possibly attack you, the less likely it
is that they will succeed.

I, however, referred to vm86 as a potential attack vector, which refers to one way in which someone could attempt to attack the system (be it through arbitrary code execution, privilege escalation, or
some other type of exploit). Note that something does not need to have a known exploit to be classified as a potential attack vector (most black hats out there will keep quiet about discovered
exploits until they can actually make use of them themselves). By their very definition, every single site where userspace can call into the kernel is a _potential_ attack vector, including vm86().
But they are not marked as such, while vm86() is.
And they do not have a run-time disabling knob.
So why is there such a big difference?
Take read(), for example. It is not a very likely attack vector because:
1. It does exactly _one_ thing.
2. It only copies data to the calling process.
3. It has no odd interactions with mm.
4. The only modification it does to how the processor is executing is for the context switch to kernel mode and back to user mode.
5. It is _very_ well audited.
Overall, this means that read() is a relatively low risk.
fork() is slightly more attractive as an exploit target, because it doesn't fit points 2 and 4 above.
vm86() is much more attractive because it doesn't fit any of the 5 points above. Other system calls that I know of that fit fewer than 3 of the 5 points above are: modify_ldt(), perf_event_open(), ptrace(), and bpf(). I regard all of these as potentially more attractive than vm86 because they are available on a wider range of platforms. modify_ldt, perf_event_open, and ptrace all have ways to disable or significantly secure them, and have also all had exploits at some point in time. bpf can be disabled and has not yet had any publicly documented exploits that I know of, but this does not mean that it is secure (especially considering how new it is).
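
To make the five-point comparison above concrete, here is a minimal sketch in Python that simply tallies how many of the points each syscall fits. The names CRITERIA, FITS, points_fit and by_attractiveness are made up for illustration, and the entries only encode what is stated above for read(), fork() and vm86(); nothing here is measured.

```python
# The five points from the read() list above, numbered as in that list.
CRITERIA = {
    1: "does exactly one thing",
    2: "only copies data to the calling process",
    3: "no odd interactions with mm",
    4: "only changes processor state for the kernel/user mode switch",
    5: "very well audited",
}

# Which points each syscall fits, as claimed in the text above
# (illustrative bookkeeping, not a measurement).
FITS = {
    "read": {1, 2, 3, 4, 5},
    "fork": {1, 3, 5},   # does not fit points 2 and 4
    "vm86": set(),       # does not fit any of the five
}

def points_fit(name):
    """Return how many of the five points the named syscall fits."""
    return len(FITS[name] & set(CRITERIA))

def by_attractiveness():
    """Order syscalls from most to least attractive target (fewest points first)."""
    return sorted(FITS, key=points_fit)
```

Under these assumptions, by_attractiveness() puts vm86 first, then fork, then read, which matches the ordering argued above. The tally is only a reading aid; it says nothing about whether any of these is actually exploitable.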

vm86() is one of the more attractive syscalls to attempt to use as an attack vector on 32-bit x86 systems because it's relatively unaudited,
This can be changed if it is at least stripped from the known
bloat, for example. This could have been done _before_ taking any
other actions on it, because the actions would then be entirely
different. Maybe, if it is properly cleaned up, the action will
change from disabling or introducing a knob to auditing it?
If you clean it up, I'd be happy to throw everything I can think of at it. Even if I don't manage to discover any exploits in that case, I would still advocate against having it available by default, because it's functionality that is used by a consistently decreasing percentage of users. Yes, I know lots of people use dosemu. The number of people who use Linux is, however, going up faster than the number of people who use dosemu (no, I don't have numbers to back this up, but it is statistically very likely to be the case), and I know a number of people who used to use it (myself included) who are moving to dosbox because the performance difference is getting less significant as computers get faster.
significantly modifies the execution state of the
processor, and is available on a majority of 32-bit x86 systems in the wild. This does not mean that it is exploitable directly, just that it's a possible target for an exploit.
So you say it is more dangerous than other syscalls, and I can
believe you, but this needs a proper justification. Someone has
to write why exactly it is more dangerous, can it be fixed or not,
etc. Like it was done for mark_screen_rdonly - I am not asking you
how it can be exploited because I take your word that this code is
a potential risk. But it can be removed. If there are other risky
parts, they also have to be identified. I simply don't think
sufficient justification was spelled out to consider it as more dangerous
than all other syscalls (modulo mmap_min_addr - that one was identified).
I've already stated _why_ it's more dangerous:
1. It interacts in odd ways with memory management.
2. It directly modifies the execution state of the processor.
It is no more potentially dangerous than any other system call that fits either description. I'm not trying to single out vm86; it just happens to be the syscall we are discussing right now. Another syscall that is a perfect example of both 1 and 2 would be modify_ldt, which _does_ have known exploits that required a rewrite, and now has a knob to disable it because most people don't use it. On almost any other OS out there, anything that did either 1 or 2 wouldn't have been merged in the first place (this is not intended as a statement against Linux), and to be honest, if someone tried to merge vm86 into Linux today, they would have a very hard time convincing people it is worth it.
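
As a rough illustration of checking which of these syscalls a given kernel was built with, here is a minimal Python sketch that parses kernel config text. CONFIG_VM86 is the option named in the subject line; CONFIG_MODIFY_LDT_SYSCALL and CONFIG_BPF_SYSCALL are assumed here to be the options controlling modify_ldt() and bpf(). The helper name enabled_options and the sample fragment are hypothetical. On a real system the text would come from wherever the distro ships its config (often /boot/config-<release> or /proc/config.gz, which is an assumption about the system).

```python
def enabled_options(config_text, options):
    """Map each option name to True if the config text sets it to 'y'."""
    set_to_y = set()
    for line in config_text.splitlines():
        line = line.strip()
        # Disabled options appear as "# CONFIG_FOO is not set" and are skipped here.
        if line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        if value == "y":
            set_to_y.add(name)
    return {opt: opt in set_to_y for opt in options}

# Hypothetical config fragment, made up for illustration only.
SAMPLE = """\
CONFIG_BPF_SYSCALL=y
# CONFIG_MODIFY_LDT_SYSCALL is not set
# CONFIG_VM86 is not set
"""

SYSCALL_OPTIONS = ["CONFIG_VM86", "CONFIG_MODIFY_LDT_SYSCALL", "CONFIG_BPF_SYSCALL"]
status = enabled_options(SAMPLE, SYSCALL_OPTIONS)
```

With the sample fragment, status reports only CONFIG_BPF_SYSCALL as enabled. This covers compile-time options only; run-time knobs would have to be checked separately.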

Reiterating what I've said before, albeit paraphrased:
1. If you can call code, there is a possibility that you can exploit it.
2. Just because there are no publicly documented exploits for something does not mean that it is secure.
3. Having functionality enabled by default that you don't need is a Very Bad Thing; this is why Windows has historically had so many security issues.
4. Reactive security is utterly useless for any system that has already been exploited. If you have been hacked by someone who actually knows what they are doing, then even your hardware is suspect at that point, and patching the initial entry point will not provide any reasonable degree of safety.

Also, this will be the last reply I make on this sub-thread. If this does not convince you of any of the points I've made, then nothing I can say is likely to.
