On Wed, Oct 16, 2019 at 09:23:37AM -0700, Sean Christopherson wrote:
On Wed, Oct 16, 2019 at 05:43:53PM +0200, Paolo Bonzini wrote:
On 16/10/19 17:41, Sean Christopherson wrote:
On Wed, Oct 16, 2019 at 04:08:14PM +0200, Paolo Bonzini wrote:
SIGBUS (actually a new KVM_EXIT_INTERNAL_ERROR result from KVM_RUN is
better, but that's the idea) is for when you're debugging guests.
Global disable (or alternatively, disable SMT) is for production use.
Alternatively, for guests without split-lock #AC enabled, what if KVM were
to emulate the faulting instruction with split-lock detection temporarily
disabled?
Yes we can get fancy, but remember that KVM is not yet supporting
emulation of locked instructions. Adding it is possible but shouldn't
be in the critical path for the whole feature.
Ah, didn't realize that. I'm surprised emulating all locks with cmpxchg
doesn't cause problems (or am I misreading the code?). Assuming I'm
reading the code correctly, the #AC path could kick all other vCPUS on
emulation failure and then retry emulation to "guarantee" success. Though
that's starting to build quite the house of cards.
Ugh, doesn't the existing emulation behavior create another KVM issue?
KVM uses a locked cmpxchg in emulator_cmpxchg_emulated() and the address
is guest controlled, e.g. a guest could coerce the host into disabling
split-lock detection via the host's #AC handler by triggering emulation
and inducing an #AC in the emulator.
How would you disable split-lock detection temporarily? Just tweak
MSR_TEST_CTRL for the time of running the one instruction, and cross
fingers that the sibling doesn't notice?
Tweak MSR_TEST_CTRL, with logic to handle the scenario where split-lock
detection is globally disable during emulation (so KVM doesn't
inadvertantly re-enable it).
There isn't much for the sibling to notice. The kernel would temporarily
allow split-locks on the sibling, but that's a performance issue and isn't
directly fatal. A missed #AC in the host kernel would only delay the
inevitable global disabling of split-lock. A missed #AC in userspace would
again just delay the inevitable SIGBUS.