Re: [RFC] x86: Avoid CR3 load on compatibility mode with PTI

From: Nadav Amit
Date: Mon Jan 15 2018 - 22:49:25 EST


Ingo Molnar <mingo@xxxxxxxxxx> wrote:

>
> * Nadav Amit <nadav.amit@xxxxxxxxx> wrote:
>
>>> Also, what's the end goal here? Run old 32-bit binaries better? You
>>> want to weaken the security of the whole implementation to do that?
>>> Sounds like a bad tradeoff to me.
>>
>> As Willy noted in this thread, I think that some users may be interested in
>> running 32-bit Apache/Nginx/Redis to get the performance back without
>> sacrificing security.
>
> Note that it is a flawed assumption to think that this is possible, as they might
> in many cases not be getting their performance back: 32-bit binaries for the same
> general CPU bound computation can easily be 5% slower than 64-bit binaries (as
> long as the larger cache footprint of 64-bit data doesn't fall out of key caches),
> but can be up to 30% slower for certain computations.
>
> In fact, depending on how kernel heavy the web workload is (for example how much
> CGI processing versus IO it does, etc.), a 32-bit binary could be distinctly
> _slower_ than even a PTI-enabled 64-bit binary.

Obviously you are right - I didn't argue otherwise - and I think it is also
reflected in the results (Redis LRANGE results). Yet, arguably the workloads
that are affected the most by PTI are those with a high number of syscalls
and interrupts, in which user computation time is relatively small.

> So we are trading a 5-15% slowdown (PTI) for another 5-15% slowdown, plus we are
> losing the soft-SMEP feature on older CPUs that PTI enables, which is a pretty
> powerful mitigation technique.

This soft-SMEP can be kept by keeping PTI enabled when SMEP is unsupported.
Although we trade one slowdown for another, they are different slowdowns,
which lets users pick the tradeoff that best suits their workload.
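
To make the rule above concrete, here is a minimal sketch of the decision
it implies. It is only an illustration under stated assumptions: the struct,
its fields and keep_pti() are hypothetical names, not taken from the kernel
or from the RFC patch, and the real decision would live in the entry code.

```c
#include <stdbool.h>

/* Hypothetical inputs; none of these names exist in the kernel or the RFC. */
struct task_ctx {
	bool cpu_has_smep;	/* CPU enforces SMEP in hardware */
	bool compat_mode;	/* task runs 32-bit code in compatibility mode */
};

/* Return true if page-table isolation should stay enabled for this task. */
static bool keep_pti(const struct task_ctx *t)
{
	/* Without hardware SMEP, PTI is what provides soft-SMEP: keep it. */
	if (!t->cpu_has_smep)
		return true;

	/* With SMEP, PTI is dropped only for compatibility-mode tasks. */
	return !t->compat_mode;
}
```

With this rule, 64-bit tasks always keep PTI, and compatibility-mode tasks
lose it only on CPUs that already have hardware SMEP.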

> Yes, I suspect in some (maybe many) cases it would be a speedup, but I really
> don't like the underlying assumptions and tradeoffs here. (Not that I like any of
> this whole Meltdown debacle TBH.)

To make sure that I understand correctly - the assumptions are that
disabling PTI in compatibility mode would: (1) Benefit some workloads; (2)
Be useful, even if we only consider CPUs with SMEP; and (3) Be secure.

Under these assumptions, the tradeoff is slightly greater code complexity
for considerably better performance of 32-bit code; in some common cases
this makes 32-bit code perform significantly better than 64-bit code.

Am I missing something? My main concern was initially security, but so far
in your aggregated feedback I have not seen anything concrete that cannot
be addressed relatively easily.