Re: [PATCH 1/3] x86/debug: Fix BTF handling

From: Peter Zijlstra
Date: Wed Oct 28 2020 - 17:45:01 EST


On Wed, Oct 28, 2020 at 06:20:25PM +0900, Masami Hiramatsu wrote:
> On Tue, 27 Oct 2020 20:41:26 +0100
> Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> > On Tue, Oct 27, 2020 at 10:15:05AM +0100, Peter Zijlstra wrote:
> >
> > > @@ -873,6 +866,20 @@ static __always_inline void exc_debug_ke
> > > */
> > > WARN_ON_ONCE(user_mode(regs));
> > >
> > > + if (test_thread_flag(TIF_BLOCKSTEP)) {
> > > + /*
> > > + * The SDM says "The processor clears the BTF flag when it
> > > + * generates a debug exception." but PTRACE_BLOCKSTEP requested
> > > + * it for userspace, but we just took a kernel #DB, so re-set
> > > + * BTF.
> > > + */
> > > + unsigned long debugctl;
> > > +
> > > + rdmsrl(MSR_IA32_DEBUGCTLMSR, debugctl);
> > > + debugctl |= DEBUGCTLMSR_BTF;
> > > + wrmsrl(MSR_IA32_DEBUGCTLMSR, debugctl);
> > > + }
> > > +
> > > /*
> > > * Catch SYSENTER with TF set and clear DR_STEP. If this hit a
> > > * watchpoint at the same time then that will still be handled.
> >
> > Masami, how does BTF interact with !optimized kprobes that single-step?
>
> Good question, BTF is cleared right before single-stepping and restored
> after single-stepping. It will be done accoding to TIF_BLOCKSTEP bit as below.
>
> (in arch/x86/kernel/kprobes/core.c)
>
> static nokprobe_inline void clear_btf(void)
> {
> if (test_thread_flag(TIF_BLOCKSTEP)) {
> unsigned long debugctl = get_debugctlmsr();
>
> debugctl &= ~DEBUGCTLMSR_BTF;
> update_debugctlmsr(debugctl);
> }
> }
>
> static nokprobe_inline void restore_btf(void)
> {
> if (test_thread_flag(TIF_BLOCKSTEP)) {
> unsigned long debugctl = get_debugctlmsr();
>
> debugctl |= DEBUGCTLMSR_BTF;
> update_debugctlmsr(debugctl);
> }
> }
>
> Hrm, so it seems that we do same ... maybe we don't need clear_btf() too?

No, I think you do very much need clear_btf(). But with my patch perhaps
restore_btf() is no longer needed. Is there only a single single-step
between setup_singlestep() and resume_execution() ? (I think so).

Also, I note that we should employ get_debugctlmsr() more consistently.

> > The best answer I can come up with is 'poorly' :/
>
> Is this what you expected? :)

Nah, I missed the above, you seems to do the right thing.