Re: [REGRESSION] x86/debug: After PTRACE_SINGLESTEP DR_STEP is no longer reported in dr6

From: Andy Lutomirski
Date: Mon Oct 26 2020 - 19:30:51 EST


On Mon, Oct 26, 2020 at 9:55 AM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> On Mon, Oct 26, 2020 at 05:31:00PM +0100, Peter Zijlstra wrote:
> > In that respect, I think the current virtual_dr6 = 0 is placed wrong, it
> > should only be in exc_debug_user(). The only 'problem' then is that we
> > seem to be able to lose BTF, but perhaps that is already an extant bug.
> >
> > Consider:
> >
> > - perf: setup in-kernel #DB
> > - tracer: ptrace(PTRACE_SINGLEBLOCK)
> > - tracee: #DB on perf breakpoint, loses BTF
> > - tracee .. never triggers actual blockstep
> >
> > Hmm ? Should we re-set BTF when TIF_BLOCKSTEP && !user_mode(regs) ?
>
> Something like so then.
>
> Or should we also have the userspace #DB re-set BTF when it was !DR_STEP?
> I need to go untangle that ptrace stuff :/
>
> diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
> index 3c70fb34028b..31de8b0980ca 100644
> --- a/arch/x86/kernel/traps.c
> +++ b/arch/x86/kernel/traps.c
> @@ -793,19 +793,6 @@ static __always_inline unsigned long debug_read_clear_dr6(void)
> set_debugreg(DR6_RESERVED, 6);
> dr6 ^= DR6_RESERVED; /* Flip to positive polarity */
>
> - /*
> - * Clear the virtual DR6 value, ptrace routines will set bits here for
> - * things we want signals for.
> - */
> - current->thread.virtual_dr6 = 0;
> -
> - /*
> - * The SDM says "The processor clears the BTF flag when it
> - * generates a debug exception." Clear TIF_BLOCKSTEP to keep
> - * TIF_BLOCKSTEP in sync with the hardware BTF flag.
> - */
> - clear_thread_flag(TIF_BLOCKSTEP);
> -
> return dr6;
> }
>
> @@ -873,6 +860,20 @@ static __always_inline void exc_debug_kernel(struct pt_regs *regs,
> */
> WARN_ON_ONCE(user_mode(regs));
>
> + if (test_thread_flag(TIF_BLOCKSTEP)) {
> + /*
> + * The SDM says "The processor clears the BTF flag when it
> + * generates a debug exception." but PTRACE_BLOCKSTEP requested
> + * it for userspace, but we just took a kernel #DB, so re-set
> + * BTF.
> + */
> + unsigned long debugctl;
> +
> + rdmsrl(MSR_IA32_DEBUGCTLMSR, debugctl);
> + debugctl |= DEBUGCTLMSR_BTF;
> + wrmsrl(MSR_IA32_DEBUGCTLMSR, debugctl);
> + }
> +
> /*
> * Catch SYSENTER with TF set and clear DR_STEP. If this hit a
> * watchpoint at the same time then that will still be handled.
> @@ -935,6 +936,26 @@ static __always_inline void exc_debug_user(struct pt_regs *regs,
> irqentry_enter_from_user_mode(regs);
> instrumentation_begin();
>
> + /*
> + * Clear the virtual DR6 value, ptrace routines will set bits here for
> + * things we want signals for.
> + */
> + current->thread.virtual_dr6 = 0;
> +
> + /*
> + * If PTRACE requested SINGLE(BLOCK)STEP, make sure to reflect that in
> + * the ptrace visible DR6 copy.
> + */
> + if (test_thread_flag(TIF_BLOCKSTEP) || test_thread_flag(TIF_SINGLESTEP))
> + current->thread.virtual_dr6 |= (dr6 & DR_STEP);

I'm guessing that this would fail a much simpler test, though: have a
program use PUSHF/POPF to set TF and then read out DR6 from the
resulting SIGTRAP. I can whip up such a test if you like.

Is there any compelling reason not to just drop the condition and do:

current->thread.virtual_dr6 |= (dr6 & DR_STEP);

unconditionally? This DR6 cause and ICEBP have the regrettable
distinction of being the only causes that a user program can trigger
entirely on its own, without informing the kernel first. This means
that we can't fully separate the concept of "user mode is
single-stepping itself" from "ptrace or something else is causing the
kernel to single-step a program."
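
To make the difference concrete, here is a minimal userspace sketch of
the two policies. It is a model, not kernel code: the function names
virtual_dr6_conditional() and virtual_dr6_unconditional() and the bool
parameters standing in for TIF_SINGLESTEP / TIF_BLOCKSTEP are made up
for illustration. Only DR_STEP (0x4000, the BS bit in DR6) and the
"dr6 & DR_STEP" expression come from the patch above.

```c
#include <stdbool.h>

/* DR6 single-step status bit (BS, bit 14); same value as the kernel's DR_STEP. */
#define DR_STEP 0x4000UL

/*
 * Hypothetical model of the hunk as posted: virtual_dr6 starts at 0 and
 * DR_STEP is copied in only if one of the ptrace step flags is set.
 * The bool arguments are stand-ins for test_thread_flag(), not a kernel API.
 */
static unsigned long virtual_dr6_conditional(unsigned long dr6,
					     bool tif_singlestep,
					     bool tif_blockstep)
{
	unsigned long virtual_dr6 = 0;

	if (tif_blockstep || tif_singlestep)
		virtual_dr6 |= (dr6 & DR_STEP);

	return virtual_dr6;
}

/*
 * Hypothetical model of the suggested alternative: DR_STEP is always
 * copied in, whoever armed TF.
 */
static unsigned long virtual_dr6_unconditional(unsigned long dr6)
{
	unsigned long virtual_dr6 = 0;

	virtual_dr6 |= (dr6 & DR_STEP);

	return virtual_dr6;
}
```

For a program that set TF by itself, neither flag would be set, so in
this model virtual_dr6_conditional(DR_STEP, false, false) yields 0 while
virtual_dr6_unconditional(DR_STEP) yields DR_STEP. That is the case I
expect the simpler test to trip over.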

I bet that, without making this tweak, the virtual_dr6 change will
regress some horrific Wine use case.

--Andy