Re: [RFC,05/10] x86/speculation: Add basic IBRS support infrastructure

From: Christophe de Dinechin
Date: Wed Jan 31 2018 - 06:07:38 EST




> On 31 Jan 2018, at 11:15, Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
>
> On Wed, 31 Jan 2018, Christophe de Dinechin wrote:
>>> On 30 Jan 2018, at 21:46, Alan Cox <gnomes@xxxxxxxxxxxxxxxxxxx> wrote:
>>>
>>>> If you are ever going to migrate to Skylake, I think you should just
>>>> always tell the guests that you're running on Skylake. That way the
>>>> guests will always assume the worst case situation wrt Spectre.
>>>
>>> Unfortunately if you do that then the guest may also decide to use other
>>> Skylake hardware features and pop its clogs when it finds out it's actually
>>> running on Westmere or SandyBridge.
>>>
>>> So you need to be able to both lie to the OS and user space via cpuid and
>>> also have a second 'but do skylake protections' that only mitigation
>>> aware software knows about.
>>
>> Yes. The most desirable lie is different depending on whether you want to
>> allow virtualization features such as migration (where you'd gravitate
>> towards a CPU with less features) or whether you want to allow mitigation
>> (where you'd rather present the most fragile CPUID, probably Skylake).
>>
>> Looking at some recent patches, I'm concerned that the code being added
>> often assumes that the CPUID is the correct way to get that info.
>> I do not think this is correct. You really want specific information about
>> the host CPUID, not whatever KVM CPUID emulation makes up.
>
> That won't cut it. If you have a heterogeneous farm of systems, then you need:
>
> - All CPUs have to support IBRS/IBPB or at least the hypervisor has to
> pretend they do by providing fake MSRs for that
>
> - Have a 'force IBRS/IBPB' mechanism so the guests don't discard it due
> to missing CPU feature bits.
>
> Though this gets worse. You have to make sure that the guest keeps _ALL_
> sorts of mitigation mechanisms enabled and does not decide to disable
> retpolines because IBRS/IBPB are "available".

What you are saying is that it's one thing to test at boot time, but
(at least) migration events should also cause a re-check. Agreed.
The alternative is to pessimistically enable mitigation in VMs.
I believe this is the current "state of the art", i.e. enable
IBRS statically via a CPU type variant.
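
To make the two options concrete, here is a rough sketch of a selection
policy that is evaluated at boot and again after a migration event. All
names below (struct spec_info, force_ibrs, select_mitigation) are
hypothetical and only for illustration; none of this is existing kernel
or KVM code, and the "force" hint is an assumed out-of-band channel from
the hypervisor, not a real interface.

```c
#include <stdbool.h>

/* Hypothetical: what the guest can learn about speculation controls. */
struct spec_info {
    bool cpuid_ibrs;  /* IBRS/IBPB bit as seen through (possibly emulated) CPUID */
    bool force_ibrs;  /* assumed hypervisor hint: keep IBRS/IBPB in use anyway */
};

/* Hypothetical: which mitigations the guest ends up using. */
struct mitigation {
    bool use_ibrs;
    bool use_retpoline;
};

/*
 * Meant to be called at boot and re-run after every migration event.
 * IBRS is used if either CPUID or the force hint asks for it, so a
 * missing feature bit alone does not discard it. Retpolines stay on
 * unconditionally, so IBRS being "available" never disables them.
 */
struct mitigation select_mitigation(const struct spec_info *info)
{
    struct mitigation m;

    m.use_ibrs = info->cpuid_ibrs || info->force_ibrs;
    m.use_retpoline = true;
    return m;
}
```

The pessimistic alternative corresponds to always setting force_ibrs,
in which case the re-check never changes the outcome.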

What is the best place to re-check anyway?

(Just out of curiosity: there are no non-symmetric systems
that mix CPUs of different generations, right?)


>
> Good luck with making all that work.

:-)

>
> Thanks,
>
> tglx