Re: [PATCH v10 6/6] x86/split_lock: Enable split lock detection by kernel parameter
From: Peter Zijlstra
Date: Thu Dec 12 2019 - 04:00:02 EST
On Mon, Nov 25, 2019 at 08:13:48AM -0800, Sean Christopherson wrote:
> On Fri, Nov 22, 2019 at 04:30:56PM -0800, Luck, Tony wrote:
> > On Fri, Nov 22, 2019 at 04:27:15PM +0100, Peter Zijlstra wrote:
> >
> > This all looks dubious on an HT system .... three snips
> > from your patch:
> >
> > > +static bool __sld_msr_set(bool on)
> > > +{
> > > +	u64 test_ctrl_val;
> > > +
> > > +	if (rdmsrl_safe(MSR_TEST_CTRL, &test_ctrl_val))
> > > +		return false;
> > > +
> > > +	if (on)
> > > +		test_ctrl_val |= MSR_TEST_CTRL_SPLIT_LOCK_DETECT;
> > > +	else
> > > +		test_ctrl_val &= ~MSR_TEST_CTRL_SPLIT_LOCK_DETECT;
> > > +
> > > +	if (wrmsrl_safe(MSR_TEST_CTRL, test_ctrl_val))
> > > +		return false;
> > > +
> > > +	return true;
> > > +}
> >
> > > +void switch_sld(struct task_struct *prev)
> > > +{
> > > +	__sld_msr_set(true);
> > > +	clear_tsk_thread_flag(current, TIF_SLD);
> > > +}
> >
> > > @@ -654,6 +654,9 @@ void __switch_to_xtra(struct task_struct *prev_p, struct task_struct *next_p)
> > >  		/* Enforce MSR update to ensure consistent state */
> > >  		__speculation_ctrl_update(~tifn, tifn);
> > >  	}
> > > +
> > > +	if (tifp & _TIF_SLD)
> > > +		switch_sld(prev_p);
> > >  }
> >
> > Don't you have some horrible races between the two logical
> > processors on the same core as they both try to set/clear the
> > MSR that is shared at the core level?
>
> Yes and no. Yes, there will be races, but they won't be fatal in any way.
>
>  - Only the split-lock bit is supported by the kernel, so there isn't a
>    risk of corrupting other bits as both threads will rewrite the current
>    hardware value.
>
>  - Toggling of split-lock is only done in "warn" mode. Worst case
>    scenario of a race is that a misbehaving task will generate multiple
>    #AC exceptions on the same instruction. And this race will only occur
>    if both siblings are running tasks that generate split-lock #ACs, e.g.
>    a race where sibling threads are writing different values will only
>    occur if CPUx is disabling split-lock after an #AC and CPUy is
>    re-enabling split-lock after *its* previous task generated an #AC.
>
>  - Transitioning between modes at runtime isn't supported and disabling
>    is tracked per task, so hardware will always reach a steady state that
>    matches the configured mode. I.e. split-lock is guaranteed to be
>    enabled in hardware once all _TIF_SLD threads have been scheduled out.
Just so, thanks for clarifying.
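
For illustration only, the argument quoted above can be played through in a
small user-space model. This is a sketch, not kernel code: the core-shared
MSR is a plain variable, the read and the write of each read-modify-write
are separate steps so they can be interleaved by hand, and every model_*
name plus MODEL_SLD_BIT is hypothetical. It assumes, as the first point
does, that nothing else modifies the other bits of the register.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Model-only stand-in for the split-lock detect bit; not the kernel's macro. */
#define MODEL_SLD_BIT (UINT64_C(1) << 29)

/* One physical core: both HT siblings see the same register value. */
struct model_core {
	uint64_t test_ctrl;
};

/* First half of the read-modify-write: snapshot the shared register. */
uint64_t model_read(const struct model_core *core)
{
	return core->test_ctrl;
}

/* Second half: write the snapshot back with only the SLD bit changed. */
void model_write(struct model_core *core, uint64_t snapshot, bool on)
{
	if (on)
		snapshot |= MODEL_SLD_BIT;
	else
		snapshot &= ~MODEL_SLD_BIT;
	core->test_ctrl = snapshot;
}

/*
 * Worst-case interleaving: both siblings read before either writes, so the
 * second write is based on a stale snapshot. Returns the final register.
 */
uint64_t model_race(uint64_t initial, bool x_on, bool y_on)
{
	struct model_core core = { .test_ctrl = initial };
	uint64_t snap_x = model_read(&core);
	uint64_t snap_y = model_read(&core);

	model_write(&core, snap_x, x_on);
	model_write(&core, snap_y, y_on);	/* last writer wins */
	return core.test_ctrl;
}

/* Each flagged task that is scheduled out writes "on", like switch_sld(). */
uint64_t model_sched_out_all(uint64_t initial, int nr_flagged_tasks)
{
	struct model_core core = { .test_ctrl = initial };

	for (int i = 0; i < nr_flagged_tasks; i++)
		model_write(&core, model_read(&core), true);
	return core.test_ctrl;
}

/* Checks the two claims; aborts via assert() if either fails. */
void model_selfcheck(void)
{
	const uint64_t other = 0x5;	/* arbitrary unrelated bits */

	/* Stale snapshots never disturb bits other than the SLD bit. */
	assert((model_race(other | MODEL_SLD_BIT, false, true) & ~MODEL_SLD_BIT) == other);
	assert((model_race(other | MODEL_SLD_BIT, true, false) & ~MODEL_SLD_BIT) == other);

	/* Once every flagged task has been switched out, detection is on. */
	assert(model_sched_out_all(other, 3) & MODEL_SLD_BIT);
}
```

Calling model_selfcheck() from a main() exercises both interleavings. The
race only decides which sibling's value of the split-lock bit wins; the
remaining bits survive either way, and the last switch-out of a flagged
task leaves detection enabled.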