Proposal: CAP_PAYLOAD to reduce Meltdown and Spectre mitigation costs
From: Avi Kivity
Date: Sat Jan 06 2018 - 14:33:36 EST
Meltdown and Spectre mitigations focus on protecting the kernel from a
hostile userspace. However, it's not a given that the kernel is the most
important target in the system. It is common in server workloads that a
single userspace application contains the valuable data on a system, and
if it were hostile, the game would already be over, without the need to
compromise the kernel.
In these workloads, a single application performs most system calls, and
so it pays the cost of protection, without benefiting from it directly
(since it is the target, rather than the kernel).
I propose to create a new capability, CAP_PAYLOAD, that allows the
system administrator to designate an application as the main workload in
that system. Other processes (like sshd or monitoring daemons) exist to
support it, so it makes sense to keep protecting the rest of the system
in case they are compromised.
When the kernel switches to user mode of a CAP_PAYLOAD process, it
doesn't switch page tables and instead leaves the kernel mapped into the
address space (still with supervisor protection, of course). This
reduces context switch cost, and will also reduce interrupt costs if the
interrupt happens while that process executes. Since a CAP_PAYLOAD
process is likely to consume the majority of CPU time, the costs
associated with Meltdown mitigation are almost completely nullified.
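To make the return-to-user decision concrete, here is a minimal userspace
sketch in C. It is not kernel code and not a patch: CAP_PAYLOAD_BIT,
struct task, task_is_payload() and exit_needs_pgtable_switch() are all
hypothetical names invented for illustration, and the bit number is
arbitrary. It only models the rule described above: a task holding the
capability keeps the kernel mapped on return to user mode, while every
other task gets the page table switch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit number; CAP_PAYLOAD does not exist in Linux. */
#define CAP_PAYLOAD_BIT 63

/* Toy stand-in for a task; only the fields this sketch needs. */
struct task {
    const char *comm;        /* process name, for readability only */
    uint64_t cap_effective;  /* effective capability bitmask */
};

/* True if the task was designated as the payload by the administrator. */
static bool task_is_payload(const struct task *t)
{
    return (t->cap_effective >> CAP_PAYLOAD_BIT) & 1;
}

/*
 * Decide, on the way back to user mode, whether to switch to the
 * user-only page tables (the Meltdown mitigation). Payload tasks skip
 * the switch and keep the kernel mapped with supervisor protection.
 */
static bool exit_needs_pgtable_switch(const struct task *t)
{
    return !task_is_payload(t);
}
```

With this model, a task created with cap_effective set to
1ULL << CAP_PAYLOAD_BIT skips the switch, while sshd or a monitoring
daemon with the bit clear still pays for it and stays isolated from
kernel memory.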
CAP_PAYLOAD has potential to be abused; every software vendor will be
absolutely certain that their application is the reason the universe
(let alone that server) exists and they will turn it on, so init systems
will have to work to make sure it doesn't get turned on without administrator
opt-in. It's also not perfect: if the payload application is
compromised, the attacker can steal ssh keys in addition to the
application's data. But I think it's better than having to choose
between significantly reduced performance and reduced security. You get
performance for
your important application, and protection against the possibility that
a remote exploit against a supporting process turns into a remote
exploit against that important application.