Re: [PATCH v2] x86/split_lock: Handle unexpected split lock as fatal

From: Xiaoyao Li

Date: Wed Jan 07 2026 - 21:11:52 EST


On 1/7/2026 11:19 PM, Dave Hansen wrote:
> On 1/7/26 05:49, Xiaoyao Li wrote:
>> + /*
>> + * If #AC occurs on split lock without X86_FEATURE_SPLIT_LOCK_DETECT
>> + * the kernel cannot handle it by disabling the detection. Treat it as
>> + * fatal regardless of the sld_state.
>> + */
>> + if (!cpu_feature_enabled(X86_FEATURE_SPLIT_LOCK_DETECT))
>> + return true;
>
> If #AC occurs on split lock without X86_FEATURE_SPLIT_LOCK_DETECT, that
> sounds more like a naughty hypervisor or buggy CPU that deserves a
> BUG_ON() rather than a situation where the kernel wants to move merrily
> along.

Yes. Such behavior is non-architectural.
1) If it happens on bare metal, the CPU is broken.
2) If it happens in a guest, the hypervisor is doing something wrong.
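
For reference, the check in the hunk above boils down to something like the following userspace sketch. This is not the kernel code: the enum, the function name, and the fallback to "fatal only in fatal mode" when the feature is enumerated are my assumptions for illustration. Only the first branch mirrors the patch.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's split lock detection mode. */
enum sld_mode { SLD_MODE_OFF, SLD_MODE_WARN, SLD_MODE_FATAL };

/*
 * Return true when a split lock #AC should be treated as fatal.
 * feature_enabled models cpu_feature_enabled(X86_FEATURE_SPLIT_LOCK_DETECT).
 */
static bool split_lock_ac_is_fatal(bool feature_enabled, enum sld_mode mode)
{
	/*
	 * Without the feature, the kernel has no way to turn detection off,
	 * so the #AC is unexpected and fatal regardless of the mode.
	 */
	if (!feature_enabled)
		return true;

	/* Assumption: with the feature present, only fatal mode is fatal. */
	return mode == SLD_MODE_FATAL;
}
```

With this model, split_lock_ac_is_fatal(false, SLD_MODE_WARN) is true, while split_lock_ac_is_fatal(true, SLD_MODE_WARN) is false.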

> This also needs an explanation in the changelog about _why_
> X86_FEATURE_SPLIT_LOCK_DETECT isn't set and can't be set. It needs to
> explain why enumeration is not present *AND* is impossible to add.

The only case I know of where such non-architectural behavior can happen is a TDX guest. It is a virtualization case, and X86_FEATURE_SPLIT_LOCK_DETECT cannot be virtualized in a sane manner because MSR_TEST_CTRL is a per-core scope MSR. Enumerating X86_FEATURE_SPLIT_LOCK_DETECT to a guest means the guest is able to enable/disable the feature freely on its own. However, on an HT system, if the guest disables the feature for its vCPU, it also disables the feature for the sibling CPU on the same core, where host processes or other VMs might be running. Even on a non-HT system, allowing the guest to disable the feature violates the host's intent of not tolerating any split lock when the host is set to fatal mode.
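
To make the per-core scope problem concrete, here is a toy userspace model, not kernel code. The struct and helper names are made up for illustration. The only real detail is that bit 29 of MSR_TEST_CTRL is the split lock detection enable bit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 29 of MSR_TEST_CTRL enables #AC on split lock. */
#define TEST_CTRL_SPLIT_LOCK_DETECT (UINT64_C(1) << 29)

/* Toy model: one physical core; both HT siblings share a single MSR copy. */
struct core_model {
	uint64_t test_ctrl;
};

/* Hypothetical helper: a write from either sibling hits the shared copy. */
static void sibling_write_test_ctrl(struct core_model *core, int sibling,
				    uint64_t val)
{
	(void)sibling;	/* per-core scope: the writer's identity is irrelevant */
	core->test_ctrl = val;
}

/* Hypothetical helper: both siblings observe the same enable bit. */
static bool sibling_sld_enabled(const struct core_model *core, int sibling)
{
	(void)sibling;
	return (core->test_ctrl & TEST_CTRL_SPLIT_LOCK_DETECT) != 0;
}
```

If sibling 1 (say, a guest vCPU) clears the bit via sibling_write_test_ctrl(), sibling_sld_enabled() becomes false for sibling 0 as well, which is exactly the leak onto host processes or other VMs described above.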

On the other hand, the question could be: "Why do we get #AC on a split lock if the feature is not available, and can it be fixed so that the #AC is not delivered?" For this question:

1) If it happens on bare metal, the CPU is broken. The kernel cannot fix it.

2) If it happens in a guest, it should be that the hypervisor has enabled the feature in the hardware MSR while the guest is running. To fix it, the hypervisor can intercept the #AC and handle it itself instead of letting the #AC be delivered to the guest. This is what KVM already does for normal guests. However, for a TDX guest, KVM cannot intercept #AC. It needs changes in the TDX module to provide such an ability.
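
The two cases above can be summarized in a small userspace model of where a split lock #AC ends up. Again this is only a sketch: all names are hypothetical, a sane CPU is assumed (so the broken bare-metal case is not modeled), and the KVM behavior is simplified to "always intercepts for a normal guest".

```c
#include <stdbool.h>

enum platform { BARE_METAL, NORMAL_GUEST, TDX_GUEST };

enum ac_outcome {
	AC_NOT_RAISED,			/* detection is off in hardware */
	AC_HANDLED_BY_HYPERVISOR,	/* intercepted, never reaches the guest */
	AC_DELIVERED_TO_KERNEL,		/* the running kernel sees the #AC */
};

/* hw_sld_enabled: state of the enable bit in the hardware MSR. */
static enum ac_outcome split_lock_ac_outcome(enum platform p, bool hw_sld_enabled)
{
	if (!hw_sld_enabled)
		return AC_NOT_RAISED;

	/* Simplification: KVM intercepts #AC for normal guests. */
	if (p == NORMAL_GUEST)
		return AC_HANDLED_BY_HYPERVISOR;

	/* Bare metal, or a TDX guest where KVM cannot intercept #AC. */
	return AC_DELIVERED_TO_KERNEL;
}

/* Unexpected: the kernel gets #AC although the feature is not enumerated. */
static bool ac_is_unexpected(enum ac_outcome o, bool feature_enumerated)
{
	return o == AC_DELIVERED_TO_KERNEL && !feature_enumerated;
}
```

In this model, only the TDX guest combination (host enables detection in hardware, guest does not enumerate the feature) yields an unexpected #AC, which is the case the patch treats as fatal.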