Re: Improve retpoline for Skylake

From: Andy Lutomirski
Date: Mon Jan 15 2018 - 13:06:48 EST

> On Jan 15, 2018, at 9:38 AM, Andrew Cooper <andrew.cooper3@xxxxxxxxxx> wrote:
>
>> On 15/01/18 16:57, Andy Lutomirski wrote:
>>
>>>> On Jan 15, 2018, at 12:26 AM, Jon Masters <jcm@xxxxxxxxxxxxxx> wrote:
>>>>
>>>> On 01/12/2018 05:03 PM, Henrique de Moraes Holschuh wrote:
>>>> On Fri, 12 Jan 2018, Andi Kleen wrote:
>>>>>> Skylake still loses if it takes an SMI, right?
>>>>> SMMs are usually rare, especially on servers, and are usually
>>>>> not very predictable, and even if you have
>>>> FWIW, a data point: SMIs can be generated on demand by userspace on
>>>> thinkpad laptops, but they will be triggered from within a kernel
>>>> context. I very much doubt this is a rare pattern...
>>> Sure. Just touch some "legacy" hardware that the vendor emulates in a
>>> nasty SMI handler. It's definitely not acceptable to assume that SMIs
>>> can't be generated under the control of some malicious user code.
>>>
>>> Our numbers on Skylake weren't bad, and there seem to be all kinds of
>>> corner cases, so again, it seems as if IBRS is the safest choice.
>>>
>> And keep in mind that SMIs generally hit all CPUs at once, making them extra nasty.
>>
>> Can we get firmware vendors to refill the return buffer just before RSM?
>
> Refill or not, you are aware that a correctly timed SMI in a leaf
> function will cause the next ret to speculate into userspace, because
> there is guaranteed perturbation in the RSB? (On the expectation that the
> SMM handler isn't entirely devoid of function calls).

Couldn't firmware fill the RSB with some known safe address, maybe even 0, and then immediately do RSM?
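
To make the idea concrete, here is a toy model, not real firmware or kernel code. It assumes the RSB behaves like a fixed-depth stack of predicted return targets in which a call pushes, a ret pops, and the oldest entry is lost on overflow. The depth of 16, the addresses, and every name below (ToyRSB, stuff, leaf_ret_prediction_after_smi) are hypothetical and chosen only for illustration. It also captures only one way an SMI handler could disturb the RSB, namely nesting deeper than the buffer; real hardware may well behave differently.

```python
RSB_DEPTH = 16          # assumed depth; the real value is microarchitecture-specific
SAFE_TARGET = 0x0       # hypothetical "known safe" address, as suggested above
KERNEL_RET = 0x1000     # hypothetical return address of the interrupted leaf function
SMM_BASE = 0x9000       # hypothetical base for return addresses inside the SMI handler


class ToyRSB:
    """Toy return stack buffer: a bounded stack of predicted return targets."""

    def __init__(self, depth=RSB_DEPTH):
        self.depth = depth
        self.entries = []  # newest entry is last

    def call(self, return_addr):
        # A call pushes its return address; on overflow the oldest entry is lost.
        self.entries.append(return_addr)
        if len(self.entries) > self.depth:
            self.entries.pop(0)

    def ret(self):
        # A ret pops the predicted target; None models an RSB underflow.
        return self.entries.pop() if self.entries else None

    def stuff(self, target=SAFE_TARGET):
        # The proposed firmware step: overwrite every entry with a safe target.
        self.entries = [target] * self.depth


def leaf_ret_prediction_after_smi(handler_depth, stuff_before_rsm):
    """Return the toy prediction for the leaf function's ret after an SMI."""
    rsb = ToyRSB()
    rsb.call(KERNEL_RET)                  # kernel calls the leaf function
    for i in range(handler_depth):        # SMI handler nests handler_depth calls...
        rsb.call(SMM_BASE + i)
    for _ in range(handler_depth):        # ...and unwinds them again
        rsb.ret()
    if stuff_before_rsm:
        rsb.stuff()                       # refill just before RSM
    return rsb.ret()                      # prediction for the leaf's ret


if __name__ == "__main__":
    print(leaf_ret_prediction_after_smi(20, stuff_before_rsm=False))
    print(leaf_ret_prediction_after_smi(20, stuff_before_rsm=True))
```

In this model a handler that nests 20 calls deep pushes the kernel's entry out, so the leaf's ret underflows unless the buffer is stuffed first. With stuffing, the ret is still mispredicted (it no longer points at the kernel return address), but the predicted target is the safe one rather than whatever the underflow fallback would supply.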

>
> Having firmware refill the RSB only makes a difference if you are on
> Skylake+ where RSB underflows are bad, and you're not using IBRS to
> protect your indirect predictions.
>
> ~Andrew