Re: stop breaking dosemu (Re: x86/kconfig/32: Rename CONFIG_VM86 and default it to 'n')

From: Austin S Hemmelgarn
Date: Thu Sep 03 2015 - 11:45:48 EST

On 2015-09-03 08:15, Stas Sergeev wrote:
03.09.2015 15:11, Austin S Hemmelgarn wrote:
On 2015-09-02 17:53, Stas Sergeev wrote:
03.09.2015 00:40, Andy Lutomirski wrote:
On Wed, Sep 2, 2015 at 2:12 PM, Stas Sergeev <stsp@xxxxxxx> wrote:
02.09.2015 23:55, Andy Lutomirski wrote:

On Wed, Sep 2, 2015 at 1:47 PM, Stas Sergeev <stsp@xxxxxxx> wrote:
02.09.2015 23:22, Josh Boyer wrote:
On Wed, Sep 2, 2015 at 1:50 PM, Stas Sergeev <stsp@xxxxxxx> wrote:
02.09.2015 20:46, Josh Boyer wrote:
On Wed, Sep 2, 2015 at 10:08 AM, Andy Lutomirski wrote:
I'd be amenable to switching the default back to y and perhaps
adding a sysctl to make the distros more comfortable. Ingo, Kees,
Brian, what do you think?
Can you please leave the default as N, and have a sysctl option to
enable it instead? While dosemu might still be in use, it isn't going
to be the common case at all. So from a distro perspective, I think
we'd probably rather have the default match the common case.
The fact that Fedora doesn't package dosemu doesn't automatically
mean that no other distro does either. Since when should kernel
defaults match Fedora's?
I didn't say that.
What you said was:

While dosemu might still be in use, it isn't going
to be the common case at all. So from a distro perspective

... which is likely true only in Fedora circles.

The default right now is N.
In a not-yet-released kernel, unless I am mistaken.
Even if Fedora already ships that kernel, other distros likely do not.

I asked it be left
that way. That's all.
Let's assume it's not yet N, unless there was a kernel release already.
It's easy to revert if it's not too late.
How about CONFIG_SYSCTL_VM86_DEFAULT which defaults to Y? Fedora
could set it to N.
Sorry, I don't understand this sysctl proposal.
Could you please explain what it is all about?
This sysctl will disable or enable the vm86() syscall at run-time,
right? What does it give us? If you disable something in the
config, this gives you, say, a smaller kernel image. If, OTOH, you
add a run-time switch, it gives you a bigger image, regardless
of its default value.
I might be missing something, but I don't understand what
problem this will solve. Have I missed some earlier message
in this thread?
For the 99%+ of users who don't use dosemu, it prevents exploits that
target vm86 from attacking their kernel.
I don't think the attack scenario was satisfactorily explained.
IIRC you only said that

The mark_screen_rdonly thing is still kind of scary. It changes PTEs
on arbitrary mappings behind the vm's back.

Just go ahead and remove mark_screen_rdonly, big deal.
Is this all of the threat?
Or do we treat _every_ syscall as the potential attack target?
Anything that messes with the VM subsystem (doubly if it does so without actually calling into the VM subsystem) is a potential target
... and should be removed.
Remove the mark_screen_rdonly hack.

as is anything that messes with execution mode or privilege level (that is, anything that can change which ring, or whatever equivalent other processors use, execution is happening in). This potentially does all three, depending on how it's called. Just because there are no known working exploits doesn't mean exploitation isn't possible, and in the case of this code, I'd say there is almost certainly some way to exploit it either to crash the system or to gain root-equivalent privileges.
Please be specific, show the dangerous code, we'll then remove it
or fix it.

The problem is we don't _know_ what could be exploited in there. There is no way to know for certain without a full audit of the code (and even that wouldn't be certain to catch everything), which is almost certainly not going to happen unless someone pays a very large amount of money for it.

We should not, however, wait until someone demonstrates a workable exploit before disabling by default something that (probably) fewer than 1% of the people running Linux on systems that can even use it are actually using. Security is not just a reactive endeavor; you need to be proactive about it as well. This means minimizing the attack surface whenever possible (and yes, this is a potential attack vector, regardless of whether there are known workable exploits or not).

What has been proposed follows the existing convention on Linux: don't break userspace, and give people who actually care about their systems being secure the option to turn it off. The current proposal is to make it default to on in the defconfig, and to have the sysctl default to leaving it enabled.

On top of this, vm86 has a set of very specific niche use cases. Most syscalls like this (AIO, bpf(), seccomp(), {m,f}advise(), etc.) can only be turned on or off by completely rebuilding the kernel; the sysctl lets you turn vm86 on or off at runtime.
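
As a rough illustration of the runtime switch, here is a minimal Python sketch of how a userspace tool might check it before trying to use vm86(). The path /proc/sys/kernel/vm86_enabled and the helper name vm86_enabled() are hypothetical; nothing in this thread names the sysctl. The sketch assumes the file holds 0 for "disabled" and anything else for "enabled", and it treats a missing file as "enabled", in line with the proposed default.

```python
from pathlib import Path

# Hypothetical location: the thread does not say what the sysctl is called.
VM86_SYSCTL = Path("/proc/sys/kernel/vm86_enabled")


def vm86_enabled(path=VM86_SYSCTL, default=True):
    """Report whether the (assumed) runtime switch allows vm86().

    Assumption: the file contains "0" when vm86() is disabled and any
    other value when it is enabled. If the file cannot be read (for
    example, on a kernel without the switch), fall back to `default`.
    """
    try:
        text = Path(path).read_text().strip()
    except OSError:
        return default
    return text != "0"
```

A tool like dosemu could call vm86_enabled() at startup and print a hint about the switch instead of failing later with an opaque error.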
