Re: x86: A process doesn't stop on hw breakpoints sometimes

From: Andy Lutomirski
Date: Mon May 23 2016 - 21:38:13 EST


On Mon, May 23, 2016 at 4:05 PM, Andrei Vagin <avagin@xxxxxxxxx> wrote:
> Hi,
>
> We use breakpoints in CRIU to stop a process before it calls
> rt_sigreturn, and we found that sometimes a process runs through a
> breakpoint without stopping on it.
>
> https://github.com/xemul/criu/issues/162
>
>
> A small reproducer is attached. It forks a child, stops it, sets a
> breakpoint, runs the child, and waits until it stops on the breakpoint. I
> run several instances of it concurrently and wait a few minutes.
>
> https://asciinema.org/a/006l3u5v82ubbkfy9fto07agd
>
> I know that it can be reproduced on:
> AMD A10 Micro-6700T
> Intel(R) Core(TM) i5-5200U CPU @ 2.20GHz
> Intel(R) Core(TM) i7-4600U CPU @ 2.10GHz
>
> so it doesn't look like a bug in the processor.

I'm guessing you're either hitting a subtle bug in the mess that is
breakpoint handling or you're hitting a bug in perf's context switch
code.

Given that the breakpoint gets missed many times in a row, this is
presumably either a bug in breakpoint programming (i.e. the thing
isn't actually set in dr0/dr7) or a bug in the bp state tracking. If
it were a bug in RF flag handling, I'd expect it to skip once and trip
the second time through.
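
For anyone who wants to check the "not actually set in dr0/dr7" theory
from the tracer side, here is a rough sketch of what dr7 should contain
for a given slot. The helper names (dr7_encode, dr7_slot_armed) are made
up for illustration and are not kernel or CRIU functions; the bit layout
is the architectural one as I understand it (enable bits in 0..7, R/W and
LEN fields from bit 16 up). A tracer could presumably compare this
against what PTRACE_PEEKUSER returns for u_debugreg[7] while the child is
stopped.

```c
#include <assert.h>
#include <stdint.h>

/* R/W field values: what kind of access triggers the breakpoint. */
enum { BP_EXEC = 0x0, BP_WRITE = 0x1, BP_RW = 0x3 };
/* LEN field values: note that the encoding is not monotonic. */
enum { BP_LEN1 = 0x0, BP_LEN2 = 0x1, BP_LEN8 = 0x2, BP_LEN4 = 0x3 };

/*
 * Hypothetical helper: build the dr7 bits for one breakpoint slot (0..3).
 * Sets only the local enable bit Ln, plus the slot's R/W and LEN fields.
 */
static uint64_t dr7_encode(unsigned slot, unsigned rw, unsigned len)
{
    assert(slot < 4);
    uint64_t v = 1ULL << (2 * slot);               /* Ln at bit 2n */
    v |= (uint64_t)(rw & 3) << (16 + 4 * slot);    /* R/Wn */
    v |= (uint64_t)(len & 3) << (18 + 4 * slot);   /* LENn */
    return v;
}

/*
 * Hypothetical helper: is the slot enabled (Ln or Gn set) in a dr7 value
 * that was read back from the tracee?
 */
static int dr7_slot_armed(uint64_t dr7, unsigned slot)
{
    assert(slot < 4);
    return ((dr7 >> (2 * slot)) & 3) != 0;
}
```

With these, an execute breakpoint in slot 0 encodes to dr7 == 0x1, so a
read-back value with bit 0 clear would point at the programming path
rather than at RF handling.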

All that being said, I stared at the code for a while and I don't see
the bug. I can trigger this quite rarely on a VM, and it's not fun to
debug :(

--Andy