Re: [RFD] x86/split_lock: Request to Intel

From: Sean Christopherson
Date: Thu Oct 17 2019 - 19:38:26 EST


On Thu, Oct 17, 2019 at 11:31:15PM +0200, Thomas Gleixner wrote:
> On Thu, 17 Oct 2019, Sean Christopherson wrote:
> > On Thu, Oct 17, 2019 at 02:29:45PM +0200, Thomas Gleixner wrote:
> > > The more I look at this trainwreck, the less interested I am in merging any
> > > of this at all.
> > >
> > > The fact that it took Intel more than a year to figure out that the MSR is
> > > per core and not per thread is yet another proof that this industry just
> > > works by pure chance.
> > >
> > > There is a simple way out of this misery:
> > >
> > > Intel issues a microcode update which does:
> > >
> > > 1) Convert the OR logic of the AC enable bit in the TEST_CTRL MSR to
> > > AND logic, i.e. when one thread disables AC it's automatically
> > > disabled on the core.
> > >
> > > Alternatively it supresses the #AC when the current thread has it
> > > disabled.
> > >
> > > 2) Provide a separate bit which indicates that the AC enable logic is
> > > actually AND based or that #AC is supressed when the current thread
> > > has it disabled.
> > >
> > > Which way I don't really care as long as it makes sense.
> >
> > The #AC bit doesn't use OR-logic, it's straight up shared, i.e. writes on
> > one CPU are immediately visible on its sibling CPU.
>
> That's less horrible than I read out of your initial explanation.
>
> Thankfully all of this is meticulously documented in the SDM ...

Preaching to the choir on this one...

> Though it changes the picture radically. The truly shared MSR allows
> regular software synchronization without IPIs and without an insane amount
> of corner case handling.
>
> So as you pointed out we need a per core state, which is influenced by:
>
> 1) The global enablement switch
>
> 2) Host induced #AC
>
> 3) Guest induced #AC
>
> A) Guest has #AC handling
>
> B) Guest has no #AC handling
>
> #1:
>
> - OFF: #AC is globally disabled
>
> - ON: #AC is globally enabled
>
> - FORCE: same as ON but #AC is enforced on guests
>
> #2:
>
> If the host triggers an #AC then the #AC has to be force disabled on the
> affected core independent of the state of #1. Nothing we can do about
> that and once the initial wave of #AC issues is fixed this should not
> happen on production systems. That disables #3 even for the #3.A case
> for simplicity sake.
>
> #3:
>
> A) Guest has #AC handling
>
> #AC is forwarded to the guest. No further action required aside of
> accounting
>
> B) Guest has no #AC handling
>
> If #AC triggers the resulting action depends on the state of #1:
>
> - FORCE: Guest is killed with SIGBUS or whatever the virt crowd
> thinks is the appropriate solution
> - ON: #AC triggered state is recorded per vCPU and the MSR is
> toggled on VMENTER/VMEXIT in software from that point on.
>
> So the only interesting case is #3.B and #1.state == ON. There you need
> serialization of the state and the MSR write between the cores, but only
> when the vCPU triggered an #AC. Until then, nothing to do.

And "vCPU triggered an #AC" should include an explicit check in KVM's
emulator.

> vmenter()
> {
> if (vcpu->ac_disable)
> this_core_disable_ac();
> }
>
> vmexit()
> {
> if (vcpu->ac_disable) {
> this_core_enable_ac();
> }
>
> this_core_dis/enable_ac() takes the global state into account and has the
> necessary serialization in place.

Overall, looks good to me. Although Tony's mail makes it obvious we need
to sync internally...