Re: stop breaking dosemu (Re: x86/kconfig/32: Rename CONFIG_VM86 and default it to 'n')

From: Austin S Hemmelgarn
Date: Thu Sep 03 2015 - 14:52:15 EST

In reply to: Stas Sergeev: "Re: stop breaking dosemu (Re: x86/kconfig/32: Rename CONFIG_VM86 and default it to 'n')"
Next in thread: Stas Sergeev: "Re: stop breaking dosemu (Re: x86/kconfig/32: Rename CONFIG_VM86 and default it to 'n')"

On 2015-09-03 12:34, Stas Sergeev wrote:

03.09.2015 18:44, Austin S Hemmelgarn wrote:

On 2015-09-03 08:15, Stas Sergeev wrote:

03.09.2015 15:11, Austin S Hemmelgarn wrote:

On 2015-09-02 17:53, Stas Sergeev wrote:

[... trimmed for brevity ...]

I don't think the attack scenario was satisfactorily explained.
IIRC you only said that
---

The mark_screen_rdonly thing is still kind of scary. It changes PTEs
on arbitrary mappings behind the vm's back.

---
Just go ahead and remove mark_screen_rdonly, big deal.
Is this all of the threat?
Or do we treat _every_ syscall as the potential attack target?

Anything that messes with the VM subsystem (doubly if it does so without actually calling into the VM subsystem) is a potential target

... and should be removed.
Remove mark_screen_rdonly hack.

as is anything that messes with execution mode or privilege
level (as in, possibly messes with which ring (or whatever equivalent metaphor other processors use) execution is happening in). This potentially does all three (depending on how it's called). Just
because there are no known working exploits doesn't mean it's not possible, and in the case of this code, I'd say there is almost certainly some way to exploit it either to crash the system or gain
root-equivalent privileges.

Please be specific, show the dangerous code, we'll then remove it
or fix it.

The problem is we don't _know_ what could be exploited in there. There is no way to know for certain without a full audit of the code

As was indicated in this thread already:
https://lkml.org/lkml/2015/9/2/317
Brian Gerst recently audited it:
---
That's
hopefully in much better shape now, though.
---

By audit, I don't mean just one person trying to make it more maintainable and fixing any bugs he found, I mean a team of people actively trying to make it break in every way imaginable. I'd be particularly interested to see how it reacts to being hit from multiple cores concurrently with trinity.

We should not, however, wait to disable something by default that (probably) less than 1% of the people who are running Linux on systems that can even use this are actually using

I am puzzled by this "probably".
Given that Ubuntu and Debian do provide it, and that the (unmaintained)
SF page shows a few hundred downloads per week, how have you calculated
the probability of its user base being below 1% of all Linux users?
Please provide more details so that I can double-check.

A few hundred downloads per week, compared to tens of millions of people using Linux worldwide (a rough guess, although probably conservative), with 10% of those Linux users on 32-bit x86 (another rough guess, although this one is more generous), still works out to around 1%. It's not possible to get exact numbers for this, and downloads from the SF page also come from the automated build testing that most modern distributions do these days, and happen for a number of reasons other than people actually using it.
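The estimate above can be sketched as a quick calculation. This is only an illustration: the concrete numbers (30 million users, 300 downloads per week) are assumed stand-ins for "tens of millions" and "a few hundred", and treating every download over a year as a distinct user is an assumption that overstates the user base.

```python
# Back-of-envelope estimate; every input is a rough guess, not measured data.
linux_users = 30_000_000    # assumed stand-in for "tens of millions"
share_32bit_x86 = 0.10      # the 10% guess from the message
downloads_per_week = 300    # assumed stand-in for "a few hundred"


def user_share(downloads_per_week, linux_users, share_32bit_x86, weeks=52):
    """Upper bound on the fraction of 32-bit x86 Linux users who downloaded
    the package, counting every download in `weeks` weeks as a distinct user."""
    eligible_users = linux_users * share_32bit_x86
    return downloads_per_week * weeks / eligible_users


share = user_share(downloads_per_week, linux_users, share_32bit_x86)
print(f"{share:.2%}")
```

With these inputs the share comes out at roughly half a percent. Distribution packages are not counted here, so the real figure could be higher or lower.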

until someone
demonstrates a workable exploit. Security is not just a reactive endeavor; you need to be proactive about it as well. This means minimizing the attack surface whenever possible (and yes, this is a
potential attack vector, regardless of whether there are known workable exploits or not).

There are ways to minimize the risk: just remove the bloat, then
see what remains.
If you leave the bloat and just call it "dangerous", people will
start disabling it, because _then_ it will really be an unmaintained
attack target. So what you propose, is the worst solution, not the best.
It will threaten the current vm86() users instead of doing them a
favour by cleaning and fixing the code, and they will start looking
into abandoning it.

As of right now, the only open-source project I know of that uses vm86 and is actively used by people on new kernels is dosemu (and the forked dosemu2). The only other open-source user of vm86() that I know of is v86d, which is no longer needed except on ancient hardware with old kernels. And as far as proprietary code goes, they need to pull their heads out of DOS, realize that sane people use protected or long mode for modern software, and get on with their lives.

I'm not saying that we shouldn't improve the code, but that we need to provide the option to turn this off at runtime. A single program, not used by a large segment of the community, depending on something is not a good reason to make everyone have it turned on.

There are servers out there that have this enabled and _never_ use it at all. Having a system call like this one usable but unused is a potential security hole, period, irrespective of the quality of the code the syscall executes.

As for abandoning it, that is happening already: 32-bit x86 systems are becoming more and more difficult to find, and vm86 is not supported at all on 64-bit kernels.

What has been proposed follows the existing convention on Linux (don't break userspace, and provide the option to people who actually care about their systems being secure to turn it off). The current
proposal is to make it default to on in the defconfig, and have the sysctl default to leaving it enabled.

On top of this, vm86 has a set of very specific niche use cases. Most syscalls like this (AIO, bpf(), seccomp(), {m,f}advise(), etc) can only be turned on and off by completely rebuilding the kernel.

"on and off"? Nice, but they are On by default (except for bpf()).
So the fact that they have no runtime knob doesn't look like a big
surprise.

Most of those (other than seccomp) are used almost exclusively in server applications, and in the case of AIO, it is possible to prevent anything from using it at runtime, but this can't be sanely relayed to any applications that use it (bonus points if you can figure out how to stop everyone from using it and why applications can't easily detect this).
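The message leaves the AIO mechanism as a puzzle. One plausible reading, which is an assumption here and not something the author states, is the `fs.aio-max-nr` sysctl: io_setup(2) is documented to fail with EAGAIN when a request exceeds that limit, which an application cannot easily tell apart from ordinary resource exhaustion. A minimal sketch for inspecting such a knob (the helper name `read_sysctl` is hypothetical):

```python
from pathlib import Path


def read_sysctl(name, root="/proc/sys"):
    """Return the integer value of a dotted sysctl name such as
    'fs.aio-max-nr', or None if the entry is missing or unreadable."""
    path = Path(root).joinpath(*name.split("."))
    try:
        return int(path.read_text().strip())
    except (FileNotFoundError, PermissionError):
        return None


# Prints the system-wide AIO event limit, or None where the entry is absent.
print(read_sysctl("fs.aio-max-nr"))
```

A value of 0 would mean no AIO context can be created system-wide, under the assumption above.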

This lets you turn this on or off at runtime.

With a big warning that "it is an attack surface and less than
1% of people use it, please don't touch"? No thank you.

I'm not saying that such a warning should be put in, and based on the backlash that the original change that sparked this thread got, nothing like that is going to be put in, but there is no reason to not be able to enable/disable it at runtime. Most people who are using desktop systems are not going to inherently know if they need it or not until they do need it, and unlike many of the other syscalls that can be disabled, many people who are likely to be using it aren't the type who are comfortable compiling their own kernel.

I'll be looking into testing and sending the patch that removes
mark_screen_rdonly. Maybe then this thread will shift a bit from
guesses and assumptions.

My statement that there is a potential security risk inherent in vm86 is not a guess or assumption, it's a fact. Every single way that user code can call into the kernel is a potential attack vector, period, irrespective of what it does. You can't say with 100% certainty that something is not a possible attack vector unless it isn't there to begin with. While disabling it at runtime is not the best option from a security standpoint, it makes it much more difficult to even try to exploit the code.

