Re: [PATCH v10 6/6] x86/split_lock: Enable split lock detection by kernel parameter

From: Sean Christopherson
Date: Mon Nov 25 2019 - 11:13:51 EST


On Fri, Nov 22, 2019 at 04:30:56PM -0800, Luck, Tony wrote:
> On Fri, Nov 22, 2019 at 04:27:15PM +0100, Peter Zijlstra wrote:
>
> This all looks dubious on an HT system .... three snips
> from your patch:
>
> > +static bool __sld_msr_set(bool on)
> > +{
> > +	u64 test_ctrl_val;
> > +
> > +	if (rdmsrl_safe(MSR_TEST_CTRL, test_ctrl_val))
> > +		return false;
> > +
> > +	if (on)
> > +		test_ctrl_val |= MSR_TEST_CTRL_SPLIT_LOCK_DETECT;
> > +	else
> > +		test_ctrl_val &= ~MSR_TEST_CTRL_SPLIT_LOCK_DETECT;
> > +
> > +	if (wrmsrl_safe(MSR_TEST_CTRL, test_ctrl_val))
> > +		return false;
> > +
> > +	return true;
> > +}
>
> > +void switch_sld(struct task_struct *prev)
> > +{
> > +	__sld_set_msr(true);
> > +	clear_tsk_thread_flag(current, TIF_CLD);
> > +}
>
> > @@ -654,6 +654,9 @@ void __switch_to_xtra(struct task_struct *prev_p, struct task_struct *next_p)
> >  		/* Enforce MSR update to ensure consistent state */
> >  		__speculation_ctrl_update(~tifn, tifn);
> >  	}
> > +
> > +	if (tifp & _TIF_SLD)
> > +		switch_sld(prev_p);
> >  }
>
> Don't you have some horrible races between the two logical
> processors on the same core as they both try to set/clear the
> MSR that is shared at the core level?

Yes and no. Yes, there will be races, but they won't be fatal in any way.

- Only the split-lock bit is supported by the kernel, so there isn't a
  risk of corrupting other bits as both threads will rewrite the current
  hardware value.

- Toggling of split-lock is only done in "warn" mode.  Worst case
  scenario of a race is that a misbehaving task will generate multiple
  #AC exceptions on the same instruction.  And this race will only occur
  if both siblings are running tasks that generate split-lock #ACs, i.e.
  sibling threads will only write different values if CPUx is disabling
  split-lock after an #AC while CPUy is re-enabling split-lock after
  *its* previous task generated an #AC.

- Transitioning between modes at runtime isn't supported and disabling
  is tracked per task, so hardware will always reach a steady state that
  matches the configured mode.  I.e. split-lock detection is guaranteed
  to be enabled in hardware once all _TIF_SLD threads have been
  scheduled out.
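
To make the argument above concrete, here is a rough user-space model in
C.  It is not kernel code and nothing in it comes from the patch: the
struct and function names (core_model, task_model, sld_msr_set,
handle_split_lock_ac, switch_from, replay_warn_mode_race) are made up
for illustration, the #AC handler behaviour (clear the bit, flag the
task) is my reading of "warn" mode as described in this thread, and the
0x5 in the low bits is an arbitrary stand-in for unrelated MSR bits.
The only hardware detail assumed is that split-lock detection is bit 29
of MSR_TEST_CTRL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: split-lock detection is bit 29 of MSR_TEST_CTRL. */
#define SPLIT_LOCK_DETECT (1ULL << 29)

/* Hypothetical model: one MSR value shared by both hyperthreads. */
struct core_model {
	uint64_t test_ctrl;
};

/* Hypothetical model of a task; tif_sld stands in for _TIF_SLD. */
struct task_model {
	bool tif_sld;
};

static bool sld_enabled(const struct core_model *core)
{
	return core->test_ctrl & SPLIT_LOCK_DETECT;
}

/* Read-modify-write that only ever touches the split-lock bit. */
static void sld_msr_set(struct core_model *core, bool on)
{
	uint64_t val = core->test_ctrl;

	if (on)
		val |= SPLIT_LOCK_DETECT;
	else
		val &= ~SPLIT_LOCK_DETECT;
	core->test_ctrl = val;
}

/* "warn" mode #AC: turn detection off and remember which task did it. */
static void handle_split_lock_ac(struct core_model *core, struct task_model *t)
{
	sld_msr_set(core, false);
	t->tif_sld = true;
}

/* Context switch away from prev: re-enable if prev had disabled it. */
static void switch_from(struct core_model *core, struct task_model *prev)
{
	if (prev->tif_sld) {
		sld_msr_set(core, true);
		prev->tif_sld = false;
	}
}

/* Replays one bad interleaving; returns the number of #ACs x took. */
static int replay_warn_mode_race(uint64_t *final_msr)
{
	struct core_model core = { .test_ctrl = SPLIT_LOCK_DETECT | 0x5 };
	struct task_model x = { false }, y = { false };
	int acs_for_x = 0;

	/* Both siblings fault while detection is on, before either
	 * handler has written the MSR. */
	assert(sld_enabled(&core));
	handle_split_lock_ac(&core, &y);
	handle_split_lock_ac(&core, &x);
	acs_for_x++;

	/* CPUy schedules its task out, re-enabling detection core-wide. */
	switch_from(&core, &y);

	/* x retries the same instruction and faults a second time. */
	if (sld_enabled(&core)) {
		handle_split_lock_ac(&core, &x);
		acs_for_x++;
	}

	/* x is finally scheduled out; no flagged task remains. */
	switch_from(&core, &x);
	assert(!x.tif_sld && !y.tif_sld);

	*final_msr = core.test_ctrl;
	return acs_for_x;
}
```

In this replay the task on CPUx eats two #ACs on one instruction, which
is the worst case described above, and once both flagged tasks have been
switched out the modeled MSR has detection enabled again with the
unrelated low bits unchanged.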