Cache-controlling kernel APIs for user-space programs to defeat Meltdown/Spectre and to make more secure applications portably

From: ArcheFire LinuxMail
Date: Sun Jan 07 2018 - 18:15:36 EST


I've been thinking that the problem that makes Meltdown/Spectre
possible is a synchronization problem between the use of the cache by
all running processes and the invalidation of the cache when switching
tasks: the cache contents of one process should no longer exist once
the kernel switches to, and runs, another process.

I've been thinking that the best solution to prevent memory leakage
between processes through the cache would be to provide a new set of
API functions that allow user-space programs to control their usage of
the cache, for example, to disable the cache entirely for memory pages
allocated with malloc(), etc., so that variables/buffers with
sensitive data reside only in RAM and never in the cache, making it
impossible for another process to leak that sensitive data.

For example, 32-bit protected mode on x86 includes a bit in each page
table entry (PCD, Page-level Cache Disable) that specifies whether
that memory page will use the cache or not:
http://wiki.osdev.org/File:Page_table.png

http://wiki.osdev.org/Paging
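
To make the bit concrete, here is a minimal sketch of how it sits in a
32-bit, non-PAE page table entry for a 4 KiB page. The bit positions
(PWT = bit 3, PCD = bit 4) are the architectural ones; the helper
names are made up for illustration and are not taken from any kernel.

```c
#include <stdint.h>

/* Flag bits of a 32-bit x86 page table entry (4 KiB pages, non-PAE). */
#define PTE_PRESENT (UINT32_C(1) << 0)
#define PTE_PWT     (UINT32_C(1) << 3)  /* page-level write-through */
#define PTE_PCD     (UINT32_C(1) << 4)  /* page-level cache disable */

/* Hypothetical helpers, named here only for illustration. */
static uint32_t pte_set_uncached(uint32_t pte) { return pte | PTE_PCD; }
static uint32_t pte_set_cached(uint32_t pte)   { return pte & ~PTE_PCD; }
static int      pte_is_uncached(uint32_t pte)  { return (pte & PTE_PCD) != 0; }
```

Setting the bit in memory is not enough on its own: a real kernel
would also have to invalidate the TLB entry for that page, and deal
with lines of the page that are already cached, before the change is
effective.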


The new cache-controlling API functions could mainly allow a program
to control this bit, control cache flushes, etc., so that, given the
virtual address of a buffer or variable, the kernel can resolve that
virtual address to its page table entry and mark that page as
uncached.
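
A rough sketch of what such an interface might look like follows.
Everything named cachectl here is hypothetical: no such call exists in
Linux, and the prototype is only declared, never implemented. The one
real piece of logic is the rounding step, which shows that because the
bit is per page, a request for a single variable necessarily affects
every page the variable touches.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical operations for a hypothetical cachectl() call. */
enum cachectl_op { CACHECTL_DISABLE, CACHECTL_ENABLE, CACHECTL_FLUSH };

/* Hypothetical prototype: declared for illustration, not implemented. */
int cachectl(const void *addr, size_t len, enum cachectl_op op);

struct page_range {
    uintptr_t first;  /* address of the first affected page */
    size_t    count;  /* number of affected pages */
};

/* Round [addr, addr + len) out to whole pages.
 * page_size must be a power of two (4096 on typical x86 setups). */
static struct page_range pages_covering(const void *addr, size_t len,
                                        size_t page_size)
{
    struct page_range r = { 0, 0 };
    if (len == 0)
        return r;

    uintptr_t mask  = ~(uintptr_t)(page_size - 1);
    uintptr_t start = (uintptr_t)addr & mask;
    uintptr_t last  = ((uintptr_t)addr + len - 1) & mask;

    r.first = start;
    r.count = (size_t)((last - start) / page_size) + 1;
    return r;
}
```

For instance, a 16-byte buffer that straddles a page boundary would
make two full pages uncached, so unrelated data sharing those pages
would lose the cache as well.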

If a program uses that for memory that it knows will contain
user/password login data, or encrypted network traffic, then that data
would never be placed in the cache.

Also, a program loader could be created to invoke programs, as with
sudo, so that when a program is run through it, all of its data pages,
and optionally its code pages, have the cache disabled in their page
table entries. Existing programs could then run securely without the
possibility of cache leaks, for example if a user wants to run a
high-security script.

Programs running as the root user could also have their usage of the
cache for data sections disabled for security reasons. It could be an
option that could be enabled or disabled, and there could be private
data sections in an executable that could specify whether to use the
CPU cache or not.

DMA memory is very fast and it's supposed to have its usage of the
cache disabled, so disabling the cache to prevent sensitive data from
being copied to random places in the cache could prove more efficient
than other ways of patching the kernel and applications. It would
require a new set of API functions that allow user programs to
enable/disable the cache for their code/data globally, as well as for
individual variables registered by the program through that API, and
to flush the cache when done dealing with sensitive data.
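
The "flush when done" part can already be approximated from user
space on x86, because the CLFLUSH instruction is not privileged. The
sketch below is only an illustration under a few assumptions: a
64-byte cache line, a compiler that provides the SSE2 intrinsics
_mm_clflush and _mm_mfence in <emmintrin.h>, and a helper name
(wipe_and_flush) invented here. It is not a defence against
Meltdown/Spectre by itself; it only shows what flushing a buffer looks
like.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__SSE2__)
#include <emmintrin.h>  /* _mm_clflush, _mm_mfence */
#endif

#define CACHE_LINE 64   /* assumed line size; typical on current x86 */

/* Zero a sensitive buffer, then evict every cache line it touches.
 * Returns the number of cache lines visited. Without SSE2 the flush
 * itself is skipped and only the wipe happens. */
static size_t wipe_and_flush(void *buf, size_t len)
{
    /* volatile stores so the compiler cannot drop the wipe */
    volatile unsigned char *p = buf;
    for (size_t i = 0; i < len; i++)
        p[i] = 0;

    uintptr_t start = (uintptr_t)buf & ~(uintptr_t)(CACHE_LINE - 1);
    uintptr_t end   = (uintptr_t)buf + len;
    size_t lines = 0;

    for (uintptr_t a = start; a < end; a += CACHE_LINE) {
#if defined(__SSE2__)
        _mm_clflush((const void *)a);
#endif
        lines++;
    }
#if defined(__SSE2__)
    _mm_mfence();  /* order the flushes before later memory accesses */
#endif
    return lines;
}
```

A program would call wipe_and_flush(password, sizeof password) right
after the last use of the buffer. A kernel-provided API could do the
same for whole pages and without the program having to guess the line
size.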

The multitasking and paging code of the kernel could still need to be
improved to make sure that the cache doesn't contain any data from the
current program when switching to another one, so that no data reaches
a program that shouldn't see it when switching tasks.


The page fault vulnerabilities, which use speculative/out-of-order
execution and then use cache loads as tests to determine private data
values, could be addressed by flushing the cache and putting the
offending process at the end of the task queue for several cycles
whenever a page fault or a related fault/exception occurs, so that by
the time it is scheduled again, no data useful for Meltdown or Spectre
is discernible.
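
As a toy model of that policy (not kernel code), the sketch below
keeps a small array-based run queue and, on a fault, performs a
stand-in "cache flush" and moves the faulting task to the tail. The
structure, the function names and the flush counter are all invented
for illustration; a real scheduler is far more involved, and keeping
the task at the back "for several cycles" would need an extra penalty
counter that is left out here.

```c
#include <stddef.h>

#define MAX_TASKS 8

struct runqueue {
    int      pid[MAX_TASKS];  /* pid[0] is scheduled next */
    size_t   len;             /* number of runnable tasks */
    unsigned flushes;         /* counts simulated cache flushes */
};

/* Placeholder: a real kernel would flush/invalidate the caches here. */
static void simulated_cache_flush(struct runqueue *rq)
{
    rq->flushes++;
}

/* Called when task `pid` takes a page fault or related exception.
 * Flushes first, then demotes the task to the end of the queue.
 * Returns 0 on success, -1 if the pid is not in the queue. */
static int on_fault(struct runqueue *rq, int pid)
{
    for (size_t i = 0; i < rq->len; i++) {
        if (rq->pid[i] != pid)
            continue;

        simulated_cache_flush(rq);

        /* shift the tasks behind it forward by one slot */
        for (size_t j = i; j + 1 < rq->len; j++)
            rq->pid[j] = rq->pid[j + 1];
        rq->pid[rq->len - 1] = pid;
        return 0;
    }
    return -1;
}
```

With a queue of pids 10, 20, 30, 40, a fault in 20 leaves the order
10, 30, 40, 20, so every other runnable task gets the CPU before the
faulting one can probe the cache again.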