Re: [patch V2 1/2] sysfs/cpu: Add vulnerability folder

From: Andrea Arcangeli
Date: Fri Jan 26 2018 - 11:23:45 EST


Hello,

On Sun, Jan 07, 2018 at 10:48:00PM +0100, Thomas Gleixner wrote:
> +static DEVICE_ATTR(meltdown, 0444, cpu_show_meltdown, NULL);
> +static DEVICE_ATTR(spectre_v1, 0444, cpu_show_spectre_v1, NULL);
> +static DEVICE_ATTR(spectre_v2, 0444, cpu_show_spectre_v2, NULL);

This sysfs feature implemented as above is weakening kernel security,
it should be 0400 above.

It doesn't make sense to expose to luser when a software fix (or even
only a software mitigation) has been disabled at build time to gain
all performance back (see CONFIG_RETPOLINE=n config option).

$ cat /boot/kernel-`uname -r`
cat: /boot/kernel-4.15.0-rc9+: Permission denied
$ cat /lib/modules/`uname -r`/kernel/arch/x86/kvm/kvm.ko
cat: /lib/modules/4.15.0-rc9+/kernel/arch/x86/kvm/kvm.ko: Permission denied
$ dmesg
dmesg: read kernel buffer failed: Operation not permitted

Noticing when cr3 is flipped in kernel/exit is easy, but noticing when
the syscall table or the whole kernel has been built with retpolines
is not trivial to detect. Same for variant#1.

Exploiting spectre variant#2 for an attacker is not without risk of
being detected while the setup is being mounted, as the CPU load would
spike for hours.

I may notice if a random background network daemon suddenly starts
running at 100% CPU load for hours (especially on mobile devices that
would be physically noticeable).

Containers shouldn't have sysfs and you can workaround the above if
you run all network daemons behind mount namespaces, but in general
leaving this directory readable by luser is weaking security because
it exposes when mounting a variant#2 attack can succeed.

It even tells when it is worth to focus on the syscall_table indirect
call or if the attack needs to dig deeper because asm retpolines were
used, but the kernel was built with a gcc without retpolines.

The only case where the above isn't weakening security is when the
full fix is on for all the variants is enabled (and variant#1 for now
just shows vulnerable..).

For the same reasons the whole directory, not just the files, should be
0500, especially if this would be used for any other equivalent issue
in the future and it won't stick to these 3 files, I didn't implement
that yet, because it's less urgent if nobody adds any more files soon.