Re: [GIT PULL v4.5] Fix INT1 recursion with unregistered breakpoints
From: Jeff Merkey
Date: Mon Jan 11 2016 - 21:26:59 EST
On 1/11/16, Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
> On Mon, Jan 11, 2016 at 6:07 PM, Jeff Merkey <linux.mdb@xxxxxxxxx> wrote:
>> On 1/11/16, Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
>>> On Mon, Jan 11, 2016 at 5:30 PM, Jeff Merkey <linux.mdb@xxxxxxxxx>
>>> wrote:
>>>> On 1/11/16, Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
>>>>> On Mon, Jan 11, 2016 at 4:44 PM, Jeff Merkey <linux.mdb@xxxxxxxxx>
>>>>> wrote:
>>>>>> Hi Thomas,
>>>>>>
>>>>>> I agree with #2, we should clear the breakpoint. As for #1, if
>>>>>> there's an execute breakpoint it MUST be cleared or it will just fire
>>>>>> off again when it sees the iretd from the int1 exception handler. I
>>>>>> do use the breakpoint API Thomas, this showed up while debugging and
>>>>>> testing the API with "lazy debug register switching".
>>>>>>
>>>>>> So do you want me to expand the patch and clear the breakpoint? Just
>>>>>> give the word and I'll get busy and GIT -R- DONE.
>>>>>
>>>>> It seems to me that you're papering over some issue instead of fixing
>>>>> the root cause. If you're using the API, then either you're doing it
>>>>> wrong or the API is broken. Can you figure out which and fix it?
>>>>>
>>>>> --Andy
>>>>>
>>>>
>>>> Andy,
>>>>
>>>> Linux should not crash because someone triggered a breakpoint or one
>>>> got triggered due to a program leaving some bits lying in a read only
>>>> register (DR6) which for some strange reason someone in the linux
>>>> world decided could be used as local storage and to pass arguments
>>>> between subsystems - a register intel designed to be read from for
>>>> status. I did not design what's in that API, I have to live with
>>>> it.
>>>
>>> The API appears to work, though. Are you *sure* you're using it
>>> correctly? Are you telling the code in kernel/hw_breakpoint.c about
>>> your breakpoint?
>>>
>>>> So all I am asking is that we fix this issue. It does not matter
>>>> to my debugger if this is fixed or not in Linux, since I carry the fix
>>>> in my patch, but it does matter to the overall robustness of Linux.
>>>
>>> Robust against what, exactly? What's the bug?
>>>
>>> I will grant that the comments about lazy dr7 switching are
>>> mystifying, and cleaning them up might be nice. But there's no
>>> adequate explanation of what the failure mode is, how to trigger it,
>>> or why your patch is a reasonable fix. As it stands, you're
>>> duplicating code.
>>>
>>> --Andy
>>
>> Andy,
>>
>> Couple of things:
>>
>> Would you like a copy of the test harness that creates this bug to
>> test for yourself? I previously posted it on the list. If you don't
>> have it, I'll provide it.
>
> If you can send a short, buildable thing that triggers it, I'll read it.
>
>>
>> Since the dr6 bits get shifted around, it doesn't matter if the
>> breakpoint was registered or not in the API because the broken handler
>> will call NULL bp structures and crash whether it's registered or not.
>>
>
> And what exactly does this have to do with anything? Your patch is
> all about spurious breakpoints triggered by dr7 and should have
> nothing much to do with the value in dr6. Unless dr6 is missing a bit
> due to some issue, but you never suggested any problem like that.
>
It's about setting the resume flag when an execute breakpoint occurs, no
matter what caused the breakpoint. If it is not set, the system will hang
with that processor stuck on the same execution address. You cannot have
an int1 exception path that does not set the resume flag, which is the
case here -- there should be no path where this flag does not get set on
an execute breakpoint.
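
To make the point concrete, here is a minimal user-space sketch of the
rule, not the kernel's handler and not the patch itself. The function
names (hit_is_execute, fixup_flags) are made up for illustration; the bit
positions are the architectural ones: B0-B3 in DR6 bits 0-3, the R/W field
for breakpoint n in DR7 bits 16+4n and 17+4n (00 means execute), and RF in
EFLAGS bit 16.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural bit positions (Intel SDM); helper names are hypothetical. */
#define EFLAGS_RF        (1UL << 16)      /* resume flag, EFLAGS bit 16 */
#define DR7_RW_SHIFT(n)  (16 + 4 * (n))   /* R/W field for breakpoint n */
#define DR7_RW_MASK      0x3UL
#define DR7_RW_EXECUTE   0x0UL            /* 00b = break on instruction fetch */

/* Return true if any breakpoint reported in DR6 (B0..B3) is an execute type. */
static bool hit_is_execute(unsigned long dr6, unsigned long dr7)
{
    for (int n = 0; n < 4; n++) {
        if (!(dr6 & (1UL << n)))
            continue;                     /* breakpoint n did not fire */
        if (((dr7 >> DR7_RW_SHIFT(n)) & DR7_RW_MASK) == DR7_RW_EXECUTE)
            return true;
    }
    return false;
}

/*
 * Return the flags image to restore on iret: RF is set whenever an execute
 * breakpoint fired, so the faulting instruction can run once without
 * re-triggering the same breakpoint.
 */
static unsigned long fixup_flags(unsigned long flags,
                                 unsigned long dr6, unsigned long dr7)
{
    if (hit_is_execute(dr6, dr7))
        flags |= EFLAGS_RF;
    return flags;
}
```

The sketch decides from the hardware state alone and does not consult any
registration table, which is the property I am arguing for on the int1
path.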
>> You keep asking the same questions and the answers are in the writeup
>> for the patch, including how it is triggered, what triggers it, how I
>> have triggered it, etc. and you are simply ignoring what's written
>> there (or you have not read it). It makes me wonder if you really
>> know and understand x86 debugger stuff.
>>
>
> No, your writeup is long, hard to read, and doesn't address how any
> actual problem exists.
>
> --Andy
>
I'm sorry Andy, but you are just flat wrong.
Jeff