Re: [PATCH v9 3/8] x86/split_lock: Introduce flag X86_FEATURE_SLD_FATAL and drop sld_state

From: Sean Christopherson
Date: Mon May 11 2020 - 14:17:46 EST


On Sat, May 09, 2020 at 10:14:02PM -0700, Andy Lutomirski wrote:
> On Fri, May 8, 2020 at 8:03 PM Xiaoyao Li <xiaoyao.li@xxxxxxxxx> wrote:
> >
> > Introduce a synthetic feature flag X86_FEATURE_SLD_FATAL, which means
> > kernel is in sld_fatal mode if set.
> >
> > Now sld_state is not needed any more that the state of SLD can be
> > inferred from X86_FEATURE_SPLIT_LOCK_DETECT and X86_FEATURE_SLD_FATAL.
>
> Is it too much to ask for Intel to actually allocate and define a
> CPUID bit that means "this CPU *always* sends #AC on a split lock"?
> This would be a pure documentation change, but it would make this
> architectural rather than something that each hypervisor needs to hack
> up.

The original plan was to request a bit in MSR_TEST_CTRL be documented as
such. Then we discovered that defining IA32_CORE_CAPABILITIES enumeration
as architectural was an SDM bug[*]. At that point, enumerating SLD to a
KVM guest through a KVM CPUID leaf is the least awful option. Emulating the
model specific behavior doesn't provide userspace with a sane way to disable
SLD for a guest, and emulating IA32_CORE_CAPABILITIES behavior would be
tantamount to emulating model specific behavior.

Once paravirt is required for basic SLD enumeration, tacking on the "fatal"
indicator is a minor blip.

I agree that having to reinvent the wheel for every hypervisor is completely
ridiculous, but it provides well defined and controllable behavior. We
could try to get two CPUID bits defined in the SDM, but pushing through all
the bureaucracy that gates SDM changes means we wouldn't have a resolution
for at least multiple months, assuming the proposal was even accepted.

[*] https://lkml.kernel.org/r/20200416205754.21177-3-tony.luck@xxxxxxxxx

> Meanwhile, I don't see why adding a cpufeature flag is worthwhile to
> avoid a less bizarre global variable. There's no performance issue
> here, and the old code looked a lot more comprehensible than the new
> code.

The flag has two main advantages:

- Automatically available to modules, i.e. KVM.
- Visible to userspace in /proc/cpuinfo.

Making the global variable available to KVM is ugly because it either
requires exporting the variable and the enums (which gets especially nasty
because kvm_intel can be built with CONFIG_CPU_SUP_INTEL=n), or requires
adding a dedicated is_sld_fatal() wrapper and thus more exported crud.

And IMO, the feature flag is the less bizarre option once it's "public"
knowledge, e.g. more in line with features that enumerate both basic support
and sub-features via CPUID bits.