Re: [PATCH v9 09/17] x86/split_lock: Handle #AC exception for split lock

From: Sean Christopherson
Date: Wed Oct 16 2019 - 13:42:03 EST


On Wed, Oct 16, 2019 at 09:23:37AM -0700, Sean Christopherson wrote:
> On Wed, Oct 16, 2019 at 05:43:53PM +0200, Paolo Bonzini wrote:
> > On 16/10/19 17:41, Sean Christopherson wrote:
> > > On Wed, Oct 16, 2019 at 04:08:14PM +0200, Paolo Bonzini wrote:
> > >> SIGBUS (actually a new KVM_EXIT_INTERNAL_ERROR result from KVM_RUN is
> > >> better, but that's the idea) is for when you're debugging guests.
> > >> Global disable (or alternatively, disable SMT) is for production use.
> > >
> > > Alternatively, for guests without split-lock #AC enabled, what if KVM were
> > > to emulate the faulting instruction with split-lock detection temporarily
> > > disabled?
> >
> > Yes we can get fancy, but remember that KVM is not yet supporting
> > emulation of locked instructions. Adding it is possible but shouldn't
> > be in the critical path for the whole feature.
>
> Ah, didn't realize that. I'm surprised emulating all locks with cmpxchg
> doesn't cause problems (or am I misreading the code?). Assuming I'm
> reading the code correctly, the #AC path could kick all other vCPUS on
> emulation failure and then retry emulation to "guarantee" success. Though
> that's starting to build quite the house of cards.

Ugh, doesn't the existing emulation behavior create another KVM issue?
KVM uses a locked cmpxchg in emulator_cmpxchg_emulated() and the address
is guest controlled, e.g. a guest could coerce the host into disabling
split-lock detection via the host's #AC handler by triggering emulation
and inducing an #AC in the emulator.

> > How would you disable split-lock detection temporarily? Just tweak
> > MSR_TEST_CTRL for the time of running the one instruction, and cross
> > fingers that the sibling doesn't notice?
>
> Tweak MSR_TEST_CTRL, with logic to handle the scenario where split-lock
> detection is globally disable during emulation (so KVM doesn't
> inadvertantly re-enable it).
>
> There isn't much for the sibling to notice. The kernel would temporarily
> allow split-locks on the sibling, but that's a performance issue and isn't
> directly fatal. A missed #AC in the host kernel would only delay the
> inevitable global disabling of split-lock. A missed #AC in userspace would
> again just delay the inevitable SIGBUS.