[PATCH] Add hw_breakpoint_enable to correct dr6 stuffing

From: Jeff Merkey
Date: Tue Dec 15 2015 - 14:53:31 EST


Please consider the attached patch.

SUMMARY

The current function to manually restore breakpoints
(hw_breakpoint_restore) places the contents of the thread.debugreg6
variable back into the dr6 hardware register, a piece of hardware
intended to be read, not used as local storage. The problem with this
is that systems that register for breakpoints, including the perf
event system and kgdb/kdb, modify this variable. I have observed
artificially generated stacked debug statuses being cycled through
dr6 and this variable over and over again in a loop by calls to
hw_breakpoint_restore() that cause strange behaviors in kdb and this
layer because this function allows user defined breakpoint statuses
to be stuffed back into the dr6 register.

This proposed function is necessary because upon debugger entry, you
must set dr7 to 0 to disable breakpoints while inside the debugger
then call hw_breakpoint_restore() to reload the state and re-enable
them. This function is identical to hw_breakpoint_restore() except
it leaves out writing a user defined thread.debugreg6 variable back
into a hardware register not intended for that purpose.

This behavior of stuffing articial values into dr6 can and does
cause system crashes and software misbehavior. The proposed
function improves the robustness of Linux.

CONDITIONS WHICH RESULT IN THIS SYSTEM FAULT OR ERRONEOUS STATUS

kgdb/kdb and the perf event system both present garbage status in dr6
then subsequently write this status into the thread.debugreg6 variable,
then in some cases call hw_breakpoint_restore() which writes this
status back into the dr6 hardware register.

arch/x86/kernel/kgdb.c
static void kgdb_hw_overflow_handler(struct perf_event *event,
struct perf_sample_data *data, struct pt_regs *regs)
{
struct task_struct *tsk = current;
int i;

for (i = 0; i < 4; i++)
if (breakinfo[i].enabled)
tsk->thread.debugreg6 |= (DR_TRAP0 << i);
}

I wanted to note here that this is the kgdb perf handler. What's a
serious problem here (and other perf handlers do this too) is that
kgdb has receive ONE breakpoint exception when this handler is called,
then inside this handler he sets ALL exception bits for any
user defined exceptions then sends this value to get stuffed back into
the dr6 register, so for each breakpoint that happens, it will trigger
a cascade of checks in the hw_breakpoint.c handler most hitting NULL
bp structures when this value gets read back out of dr6. If one of
them fires off and debugreg6 has gotten zapped a system crash can occur.

arch/x86/kernel/kgdb.c
static void kgdb_correct_hw_break(void)
{
... snip ...

if (!dbg_is_early)
hw_breakpoint_restore();

... snip ...
}

arch/x86/kernel/hw_breakpoint.c
void hw_breakpoint_restore(void)
{
set_debugreg(__this_cpu_read(cpu_debugreg[0]), 0);
set_debugreg(__this_cpu_read(cpu_debugreg[1]), 1);
set_debugreg(__this_cpu_read(cpu_debugreg[2]), 2);
set_debugreg(__this_cpu_read(cpu_debugreg[3]), 3);
set_debugreg(current->thread.debugreg6, 6);
set_debugreg(__this_cpu_read(cpu_dr7), 7);
}

The hardware only alters those bits that change, the rest of the altered
dr6 value remains in the register.

Upon the next int1 exception, dr6 presents this manufactured status to
the int1 handler in hw_breakpoint.c which calls the non-existent
breakpoint exceptions and any handlers which may have validly
registered, creating phantom events. If other subsystems which call
the perf handlers also have breakpoints registered, this
manufactured status causes erroneous events to be signaled to the layers
above.

arch/x86/kernel/hw_breakpoint.c
static int hw_breakpoint_handler(struct die_args *args)
{
... snip ...

/* Handle all the breakpoints that were triggered */
for (i = 0; i < HBP_NUM; ++i) {
if (likely(!(dr6 & (DR_TRAP0 << i))))
continue;

... snip ...

perf_bp_event(bp, args->regs);

... snip ...
}

After a few iterations of this cycling through the system, the
thread.debugreg6 variable starts to resemble a random number generator
as far as to which breakpoint just occurred.

So it goes like this:

INT1 exception #1 -> kgdb/perf -> set status bits for ALL
exceptions I registered (1,2,3,4) -> stuff it back into dr6 ->
INT1 exception #2 -> pickup bogus status in dr6 -> call handlers for
(1,2,3,4) -> Oops someone changed thread.debugreg6
(kgdb,perf,kdb,gdb,os, current moved) -> CRASH

TESTING AND REVIEW PERFORMED

I have reviewed all the code that touches this patch and have
determined it will function and support all of the software that
depends on this handler properly. I have compiled and tested this
patch with a test harness that tests the robustness of the linux
breakpoint API and handlers in the following ways:

1. Setting multiple conditional breakpoints through
arch_install_hw_breakpoint API across four processors to test the rate
at which the interface can handle breakpoint exceptions

2. Setting unregistered breakpoints to test the handlers robustness
in dealing with error handling conditions and errant or spurious
hardware conditions and to simulate actual "lazy debug register
switching" with null bp handlers to test the
robustness of the handlers.

3. Clearing and setting breakpoints across multiple processors then
triggering concurrent exceptions in both interrupt and process
contexts.

Signed-off-by: Jeff Merkey <linux.mdb@xxxxxxxxx>
---
arch/x86/kernel/hw_breakpoint.c | 10 ++++++++++
1 file changed, 10 insertions(+)

diff --git a/arch/x86/kernel/hw_breakpoint.c b/arch/x86/kernel/hw_breakpoint.c
index 50a3fad..7ea0f78 100644
--- a/arch/x86/kernel/hw_breakpoint.c
+++ b/arch/x86/kernel/hw_breakpoint.c
@@ -425,6 +425,16 @@ void hw_breakpoint_restore(void)
}
EXPORT_SYMBOL_GPL(hw_breakpoint_restore);

+void hw_breakpoint_enable(void)
+{
+ set_debugreg(__this_cpu_read(cpu_debugreg[0]), 0);
+ set_debugreg(__this_cpu_read(cpu_debugreg[1]), 1);
+ set_debugreg(__this_cpu_read(cpu_debugreg[2]), 2);
+ set_debugreg(__this_cpu_read(cpu_debugreg[3]), 3);
+ set_debugreg(__this_cpu_read(cpu_dr7), 7);
+}
+EXPORT_SYMBOL_GPL(hw_breakpoint_enable);
+
/*
* Handle debug exception notifications.
*
--
1.8.3.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/