Re: [PATCH v3] x86/speculation, KVM: only IBPB for switch_mm_always_ibpb on vCPU load

From: Borislav Petkov
Date: Fri Apr 29 2022 - 18:22:34 EST


On Fri, Apr 29, 2022 at 09:59:52PM +0000, Sean Christopherson wrote:
> Correct, but KVM also doesn't do IBPB on VM-Exit (or VM-Entry),

Why doesn't it do that? Not needed?

> nor does KVM do IBPB before exiting to userspace.

Same question.

> The IBPB we want to whack is issued only when KVM is switching vCPUs.

Then please document it properly as I've already requested.

> Except that _none_ of that documentation explains why the hell KVM
> does IBPB when switching between vCPUs.

Probably because the folks involved in those patches weren't the hell
mainly virt people. Although I see a bunch of virt people on CC on that
patch.

> : But stepping back, why does KVM do its own IBPB in the first place?  The goal is
> : to prevent one vCPU from attacking the next vCPU run on the same pCPU.  But unless
> : userspace is running multiple VMs in the same process/mm_struct, switching vCPUs,
> : i.e. switching tasks, will also switch mm_structs and thus do IBPB via cond_mitigation.
> :
> : If userspace runs multiple VMs in the same process,

This keeps popping up. Who does that? Can I get a real-life example of
such VM-based containers, or whatever the hell that is, pls?

> enables cond_ibpb, _and_ sets
> : TIF_SPEC_IB, then it's being stupid and isn't getting full protection in any case,
> : e.g. if userspace is handling an exit-to-userspace condition for two vCPUs from
> : different VMs, then the kernel could switch between those two vCPUs' tasks without
> : bouncing through KVM and thus without doing KVM's IBPB.
> :
> : I can kinda see doing this for always_ibpb, e.g. if userspace is unaware of spectre
> : and is naively running multiple VMs in the same process.

So this needs a clearer definition: what protection are we even talking
about when the address spaces of processes are shared? My naïve
thinking would be: none. They're sharing address space - branch pred.
poisoning between the two is the least of their worries.

So to cut to the chase: it sounds to me like you don't want to do IBPB
at all on vCPU switch. And the process switch case is taken care of by
switch_mm().
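
To make that concrete, here is a rough standalone model of the reasoning
as I read it. It is a sketch, not kernel code: the enum, both function
names and the three parameters are made up for illustration, and it
assumes the task-switch path issues no barrier when the mm is unchanged,
always issues one on an mm change in always_ibpb mode, and in cond_ibpb
mode issues one only when TIF_SPEC_IB is involved.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the IBPB modes discussed above; not kernel code. */
enum ibpb_mode { IBPB_OFF, IBPB_COND, IBPB_ALWAYS };

/*
 * Model of the task-switch path: no barrier when the mm is unchanged;
 * on an mm change, always_ibpb fires unconditionally and cond_ibpb
 * fires only when one of the tasks has TIF_SPEC_IB set.
 */
static bool switch_mm_does_ibpb(enum ibpb_mode mode, bool same_mm, bool spec_ib)
{
	if (same_mm)
		return false;
	if (mode == IBPB_ALWAYS)
		return true;
	return mode == IBPB_COND && spec_ib;
}

/*
 * An extra barrier on vCPU load is redundant exactly when the mm switch
 * that preceded it has already issued one.
 */
static bool kvm_ibpb_is_redundant(enum ibpb_mode mode, bool same_mm, bool spec_ib)
{
	return switch_mm_does_ibpb(mode, same_mm, spec_ib);
}
```

Under those assumptions, the only case where the vCPU-load barrier is
not already covered in always_ibpb mode is two vCPUs sharing one mm,
which is the case I'm asking about above.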

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette