Re: [GIT PULL v4.5] Fix INT1 recursion with unregistered breakpoints

From: Andy Lutomirski
Date: Mon Jan 11 2016 - 20:55:03 EST


On Mon, Jan 11, 2016 at 5:30 PM, Jeff Merkey <linux.mdb@xxxxxxxxx> wrote:
> On 1/11/16, Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
>> On Mon, Jan 11, 2016 at 4:44 PM, Jeff Merkey <linux.mdb@xxxxxxxxx> wrote:
>>> Hi Thomas,
>>>
>>> I agree with #2, we should clear the breakpoint. As for #1, if
>>> there's an execute breakpoint it MUST be cleared or it will just fire
>>> off again when it sees the iretd from the int1 exception handler. I
>>> do use the breakpoint API Thomas, this showed up while debugging and
>>> testing the API with "lazy debug register switching".
>>>
>>> So do you want me to expand the patch and clear the breakpoint? Just
>>> give the word and I'll get busy and GIT -R- DONE.
>>
>> It seems to me that you're papering over some issue instead of fixing
>> the root cause. If you're using the API, then either you're doing it
>> wrong or the API is broken. Can you figure out which and fix it?
>>
>> --Andy
>>
>
> Andy,
>
> Linux should not crash because someone triggered a breakpoint or one
> got triggered due to a program leaving some bits lying in a read-only
> register (DR6) which for some strange reason someone in the Linux
> world decided could be used as local storage and to pass arguments
> between subsystems - a register Intel designed to be read from for
> status. I did not design what's in that API, I have to live with
> it.

The API appears to work, though. Are you *sure* you're using it
correctly? Are you telling the code in kernel/hw_breakpoint.c about
your breakpoint?
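
To make the question concrete, here is a minimal user-space sketch (not
kernel code) of the case being argued about: DR6 reports a hit for a
slot that DR7 does not have enabled. The bit positions follow the x86
debug register layout (B0-B3 in DR6 bits 0-3, Ln/Gn in DR7 bits 2n and
2n+1, R/Wn in DR7 bits 16+4n). The function names are made up for
illustration and are not part of kernel/hw_breakpoint.c.

```c
#include <stdint.h>

/* Hypothetical helper: is slot n (0-3) enabled via Ln or Gn in DR7? */
int dr7_slot_enabled(uint64_t dr7, int slot)
{
    return ((dr7 >> (slot * 2)) & 0x3) != 0;
}

/* Hypothetical helper: R/Wn == 00 means break on instruction execution. */
int dr7_slot_is_execute(uint64_t dr7, int slot)
{
    return ((dr7 >> (16 + slot * 4)) & 0x3) == 0;
}

/*
 * Hypothetical helper: return a mask of slots whose Bn bit is set in
 * DR6 while the matching slot is not enabled in DR7. These are the
 * hits that nobody appears to have asked for.
 */
unsigned stale_hits(uint64_t dr6, uint64_t dr7)
{
    unsigned mask = 0;

    for (int slot = 0; slot < 4; slot++) {
        if ((dr6 & (1u << slot)) && !dr7_slot_enabled(dr7, slot))
            mask |= 1u << slot;
    }
    return mask;
}
```

With dr6 = 0x3 and dr7 = 0x1, stale_hits() returns 0x2: slot 1 is
flagged in DR6 but was never enabled, which is roughly what a handler
would see for a breakpoint it was never told about.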

> So all I am asking is that we fix this issue. It does not matter
> to my debugger whether this is fixed or not in Linux, since I carry the
> fix in my patch, but it does matter to the overall robustness of Linux.

Robust against what, exactly? What's the bug?

I will grant that the comments about lazy dr7 switching are
mystifying, and cleaning them up might be nice. But there's no
adequate explanation of what the failure mode is, how to trigger it,
or why your patch is a reasonable fix. As it stands, you're
duplicating code.

--Andy