Re: [PATCH v10 6/6] x86/split_lock: Enable split lock detection by kernel parameter
From: Peter Zijlstra
Date: Fri Nov 22 2019 - 15:30:17 EST
On Fri, Nov 22, 2019 at 10:44:57AM -0800, Sean Christopherson wrote:
> On Fri, Nov 22, 2019 at 04:27:15PM +0100, Peter Zijlstra wrote:
> > On Fri, Nov 22, 2019 at 11:51:41AM +0100, Peter Zijlstra wrote:
> >
> > > A non-lethal default enabled variant would be even better for them :-)
> >
> > diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h
> > index d779366ce3f8..d23638a0525e 100644
> > --- a/arch/x86/include/asm/thread_info.h
> > +++ b/arch/x86/include/asm/thread_info.h
> > @@ -92,6 +92,7 @@ struct thread_info {
> > #define TIF_NOCPUID 15 /* CPUID is not accessible in userland */
> > #define TIF_NOTSC 16 /* TSC is not accessible in userland */
> > #define TIF_IA32 17 /* IA32 compatibility process */
> > +#define TIF_SLD 18 /* split_lock_detect */
>
> Maybe use SLAC (Split-Lock AC) as the acronym? I can't help but read
> SLD as "split-lock disabled". And name this TIF_NOSLAC (or TIF_NOSLD if
> you don't like SLAC) since it's set when the task is running without #AC?
I'll take any other name, really. I was typing in a hurry and my
pick-a-sensible-name generator was definitely not running.
> > diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
> > index bd2a11ca5dd6..c04476a1f970 100644
> > --- a/arch/x86/kernel/process.c
> > +++ b/arch/x86/kernel/process.c
> > @@ -654,6 +654,9 @@ void __switch_to_xtra(struct task_struct *prev_p, struct task_struct *next_p)
> >  		/* Enforce MSR update to ensure consistent state */
> >  		__speculation_ctrl_update(~tifn, tifn);
> >  	}
> > +
> > +	if (tifp & _TIF_SLD)
> > +		switch_sld(prev_p);
> >  }
>
> Re-enabling #AC when scheduling out the misbehaving task would also work
> well for KVM, e.g. call a variant of handle_user_split_lock() on an
> unhandled #AC in the guest.
Initially I thought of having a timer to re-enable it, but this also
works. We really shouldn't be hitting this much, and any actual
occurrence needs to be investigated and fixed anyway.
I've not thought much about guests, that's not really my thing. But I'll
think about it a bit :-)
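
For reference, the "disable on #AC, re-enable when the offender is
scheduled out" idea can be modelled in plain user-space C. This is only a
sketch under stated assumptions: every sketch_* name is hypothetical, the
bool stands in for the per-CPU detection enable bit, and the only value
taken from the patch is bit 18 for TIF_SLD. What switch_sld() really does
is not shown in the quoted hunks, so treat the switch-out step as a guess
at its behaviour.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration only; not kernel definitions. */
#define SKETCH_TIF_SLD (1u << 18)	/* mirrors TIF_SLD (bit 18) from the patch */

struct sketch_task {
	uint32_t flags;			/* models the thread_info flags word */
};

struct sketch_cpu {
	bool split_lock_detect_on;	/* models the per-CPU detection enable bit */
};

/* User-space #AC: turn detection off on this CPU and mark the task. */
static void sketch_handle_user_split_lock(struct sketch_cpu *cpu,
					  struct sketch_task *task)
{
	cpu->split_lock_detect_on = false;
	task->flags |= SKETCH_TIF_SLD;
}

/* Scheduling out prev: if it was marked, clear the mark and re-enable. */
static void sketch_switch_out(struct sketch_cpu *cpu, struct sketch_task *prev)
{
	if (prev->flags & SKETCH_TIF_SLD) {
		prev->flags &= ~SKETCH_TIF_SLD;
		cpu->split_lock_detect_on = true;
	}
}
```

In this model the offending task runs without detection only until its
next context switch, so no timer is needed to turn detection back on.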
> > +dotraplinkage void do_alignment_check(struct pt_regs *regs, long error_code)
> > +{
> > +	unsigned int trapnr = X86_TRAP_AC;
> > +	char str[] = "alignment check";
> > +	int signr = SIGBUS;
> > +
> > +	RCU_LOCKDEP_WARN(!rcu_is_watching(), "entry code didn't wake RCU");
> > +
> > +	if (notify_die(DIE_TRAP, str, regs, error_code, trapnr, signr) == NOTIFY_STOP)
> > +		return;
> > +
> > +	if (!handle_split_lock())
>
> Pretty sure this should be omitted entirely.
Yes, I just wanted to early exit the thing for !SUP_INTEL.
> For an #AC in the kernel,
> simply restarting the instruction will fault indefinitely, e.g. dying is
> probably the best course of action if a (completely unexpected) #AC occurs
> in "off" mode. Dropping this check also lets handle_user_split_lock() do
> the right thing for #AC due to EFLAGS.AC=1 (pointed out by Tony).
However, I'd completely forgotten about EFLAGS.AC.
> > +		return;
> > +
> > +	if (!user_mode(regs))
> > +		die("Split lock detected\n", regs, error_code);
> > +
> > +	cond_local_irq_enable(regs);
> > +
> > +	if (handle_user_split_lock(regs, error_code))
> > +		return;
> > +
> > +	do_trap(X86_TRAP_AC, SIGBUS, "alignment check", regs,
> > +		error_code, BUS_ADRALN, NULL);
> > +}
> > +}
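
With the handle_split_lock() early exit dropped as discussed, the
remaining decision flow is roughly the following. This is a user-space
sketch, not the handler itself: the sketch_* names are hypothetical, the
notify_die() step is left out, and the second argument stands for the
return value of handle_user_split_lock(), which is assumed here to
decline an #AC caused by EFLAGS.AC=1 so that it falls through to SIGBUS.

```c
#include <stdbool.h>

/* Hypothetical outcome labels for illustration only. */
enum sketch_ac_action {
	SKETCH_AC_DIE,		/* kernel-mode #AC: die() */
	SKETCH_AC_RESUME,	/* user split lock handled: return to the task */
	SKETCH_AC_SIGBUS,	/* ordinary alignment fault: do_trap() with SIGBUS */
};

static enum sketch_ac_action sketch_alignment_check(bool user_mode,
						    bool user_split_lock_handled)
{
	/* Restarting a kernel instruction would just fault again, so die. */
	if (!user_mode)
		return SKETCH_AC_DIE;

	/* Models handle_user_split_lock() returning true. */
	if (user_split_lock_handled)
		return SKETCH_AC_RESUME;

	/* Anything left over, e.g. an EFLAGS.AC=1 fault, gets SIGBUS. */
	return SKETCH_AC_SIGBUS;
}
```

Note that the kernel-mode case is decided first, so a kernel #AC never
reaches the user split-lock path regardless of the detection mode.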