Re: Improve retpoline for Skylake

From: Andrew Cooper
Date: Mon Jan 15 2018 - 12:38:08 EST


On 15/01/18 16:57, Andy Lutomirski wrote:
>
>> On Jan 15, 2018, at 12:26 AM, Jon Masters <jcm@xxxxxxxxxxxxxx> wrote:
>>
>>> On 01/12/2018 05:03 PM, Henrique de Moraes Holschuh wrote:
>>> On Fri, 12 Jan 2018, Andi Kleen wrote:
>>>>> Skylake still loses if it takes an SMI, right?
>>>> SMMs are usually rare, especially on servers, and are usually
>>>> not very predictable, and even if you have
>>> FWIW, a data point: SMIs can be generated on demand by userspace on
>>> thinkpad laptops, but they will be triggered from within a kernel
>>> context. I very much doubt this is a rare pattern...
>> Sure. Just touch some "legacy" hardware that the vendor emulates in a
>> nasty SMI handler. It's definitely not acceptable to assume that SMIs
>> can't be generated under the control of some malicious user code.
>>
>> Our numbers on Skylake weren't bad, and there seem to be all kinds of
>> corner cases, so again, it seems as if IBRS is the safest choice.
>>
> And keep in mind that SMIs generally hit all CPUs at once, making them extra nasty.
>
> Can we get firmware vendors to refill the return buffer just before RSM?

Refill or not, you are aware that a correctly timed SMI in a leaf
function will cause the next ret to speculate into userspace, because
there is guaranteed perturbation in the RSB? (On the expectation that the
SMM handler isn't entirely devoid of function calls.)

Having firmware refill the RSB only makes a difference if you are on
Skylake+ where RSB underflows are bad, and you're not using IBRS to
protect your indirect predictions.

~Andrew