Re: [PATCH v6 8/8] kvm: vmx: virtualize split lock detection

From: Thomas Gleixner
Date: Tue Mar 24 2020 - 21:42:11 EST


Xiaoyao Li <xiaoyao.li@xxxxxxxxx> writes:
> On 3/25/2020 8:40 AM, Thomas Gleixner wrote:
>>> if (!split_lock_detect_on() ||
>>> + guest_cpu_split_lock_detect_on(vmx) ||
>>> guest_cpu_alignment_check_enabled(vcpu)) {
>>
>> If the host has split lock detection disabled then how is the guest
>> supposed to have it enabled in the first place?
>
> So we need to reach an agreement on whether we need a state that host
> turns it off but feature is available to be exposed to guest.

There is a very simple agreement:

If the host turns it off, then it is not available at all

If the host sets 'warn' then this applies to everything

If the host sets 'fatal' then this applies to everything

Make it simple and consistent.

>>> + if (static_cpu_has(X86_FEATURE_SPLIT_LOCK_DETECT) &&
>>> + guest_cpu_split_lock_detect_on(vmx)) {
>>> + if (test_thread_flag(TIF_SLD))
>>> + sld_turn_back_on();
>>
>> This is completely inconsistent behaviour. The only way that TIF_SLD is
>> set is when the host has sld_state == sld_warn and the guest triggered
>> a split lock #AC.
>
> Can you image the case that both host and guest set sld_state == sld_warn.
>
> 1. There is guest userspace thread causing split lock.
> 2. It sets TIF_SLD for the thread in guest, and clears SLD bit to re-
> execute the instruction in guest.
> 3. Then it still causes #AC since hardware SLD is not cleared. In host
> kvm, we call handle_user_split_lock() that sets TIF_SLD for this VMM
> thread, and clears hardware SLD bit. Then it enters guest and re-execute
> the instruction.
> 4. In guest, it schedules to another thread without TIF_SLD being set.
> it sets the SLD bit to detect the split lock for this thread. So for
> this purpose, we need to turn sld back on for the VMM thread, otherwise
> this guest vcpu cannot catch split lock any more.

If you really want to address that scenario, then why are you needing
any of those completely backwards interfaces at all?

Just because your KVM exception trap uses the host handling function
which sets TIF_SLD?

Please sit down, go back to the drawing board and figure it out how to
solve that without creating inconsistent state and duct tape functions
to deal with that.

I'm not rewriting the patches for you this time.

> Do you need me to spin a new version of patch 1 to clear SLD bit on APs
> if SLD_OFF?

I assume you can answer that question yourself.

Thanks,

tglx