Re: Query: ARM64: Behavior of el1_dbg exception while executing el0_dbg
From: William Cohen
Date: Mon Jan 12 2015 - 14:26:36 EST
On 01/12/2015 12:30 PM, Will Deacon wrote:
> On Fri, Jan 09, 2015 at 05:13:29PM +0000, Pratyush Anand wrote:
>>
>>
>> On Friday 09 January 2015 09:16 PM, Will Deacon wrote:
>>> On Thu, Jan 08, 2015 at 05:28:37PM +0000, Pratyush Anand wrote:
>>>> On Thursday 08 January 2015 09:53 PM, Will Deacon wrote:
>>>>> On Thu, Jan 08, 2015 at 01:15:58PM +0000, Pratyush Anand wrote:
>>>>>> I am trying to test following scenario, which seems valid to me. But I
>>>>>> am very new to ARM64 as well as to debugging tools, so seeking expert's
>>>>>> comment here.
>>>>>>
>>>>>> -- I have inserted a kprobe to the function uprobe_breakpoint_handler
>>>>>> which is called from el0_dbg
>>>>>> (el0_dbg->do_debug_exception->brk_handler->call_break_hook->uprobe_breakpoint_handler)
>>>>>>
>>>>>> -- kprobe is enabled.
>>>>>>
>>>>>> -- an uprobe is inserted into a test application and enabled.
>>>>>>
>>>>>> So, when uprobe is enabled and test code execution reaches to probe
>>>>>> instruction, it executes uprobe breakpoint instruction and el0_dbg
>>>>>> exception is raised.
>>>>>>
>>>>>> When control reaches to start of uprobe_breakpoint_handler and it
>>>>>> executes first instruction (which has been replaced with a kprobe
>>>>>> breakpoint instruction), el1_dbg exception is raised.
>>>>>
>>>>> Hmm, debug exceptions should be masked at this point so I don't see why
>>>>> you're taking the second debug exception.
>>>>>
>>>>
>>>> So, you mean to say that when an exception which has been taken from
>>>> lower exception level (EL0) is being executed, then we keep masked also
>>>> the exception from current exception level (EL1)...
>>>
>>> Yeah, if you look at entry.S then you'll see that neither el0_dbg or el1_dbg
>>> re-enable debug exceptions (masked automatically by the CPU after taking the
>>> exception) until *after* the handling has completed. This is to prevent
>>> recursive debug exceptions, which I don't see how we can reasonably handle.
>>
>> May be I am missing something, but my observation on silicon is
>> different. Please have a look at git log of HEAD of following branch,
>> which says that el1_dbg exception has been raised while el0_dbg was
>> executing. Do not know what I am missing..
>>
>> https://github.com/pratyushanand/linux/tree/ml_arm64_uprobe_devel_debug_kprobe_insertion_at_uprobe_breakpoint_handler
>
> That page just says "Failed to load latest commit information." for me.
I got that message too, but I was able to see the history and the information in the first entry of:
https://github.com/pratyushanand/linux/commits/ml_arm64_uprobe_devel_debug_kprobe_insertion_at_uprobe_breakpoint_handler
>
> Regardless, I think you need to debug further and find out if PSTATE.D is
> getting cleared and, if so, who is responsible for that. Somebody could be
> enabling IRQs, for example, which will then unmask debug exceptions in
> el1_irq.
>
> Will
>
If the problem is due to IRQs being enabled and an IRQ handler then re-enabling the flag, it would be possible to use a systemtap script to monitor the irq_handler_entry and irq_handler_exit tracepoints to see if PSTATE.D is getting cleared. Maybe something like the attached script. This script doesn't use the kprobe support, so it should avoid the problematic interactions between kprobes and uprobes.
-Will Cohen
global pstated

// PSTATE.D (debug exception mask) is bit 9; nonzero result means masked
function masked_dflag:long(f) { return ((f & (1 << 9)) != 0) }

probe irq_handler.entry {
  // Record if pstate.d is masked
  pstated[cpu(), irq] = masked_dflag(flags)
}

probe irq_handler.exit {
  // Report handlers that were entered with D masked and left with it clear
  if ((!masked_dflag(flags)) && pstated[cpu(), irq]) {
    printf("d flag unmasked in irq %d(%s)\n", irq, kernel_string(dev_name));
  }
  delete pstated[cpu(), irq]
}
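
For checking saved PSTATE values by hand (for example, values copied out of a register dump), the same bit test can be sketched in Python. This is only an illustrative helper, not part of the attached script: the function names decode_daif and d_unmasked_during are hypothetical, and the only facts assumed are the AArch64 positions of the D, A, I and F mask bits (bits 9, 8, 7 and 6 of a saved PSTATE/SPSR value).

```python
# Hypothetical helper for inspecting the DAIF mask bits of a saved
# AArch64 PSTATE (SPSR) value. Bit positions follow the architecture:
# D = bit 9, A = bit 8, I = bit 7, F = bit 6. A set bit means "masked".
PSR_F_BIT = 1 << 6  # FIQ masked
PSR_I_BIT = 1 << 7  # IRQ masked
PSR_A_BIT = 1 << 8  # SError (asynchronous abort) masked
PSR_D_BIT = 1 << 9  # debug exceptions masked


def masked_dflag(pstate):
    """Return True if debug exceptions are masked in this PSTATE value."""
    return (pstate & PSR_D_BIT) != 0


def decode_daif(pstate):
    """Render the mask bits as a 4-character string, '-' meaning unmasked."""
    names = (("D", PSR_D_BIT), ("A", PSR_A_BIT),
             ("I", PSR_I_BIT), ("F", PSR_F_BIT))
    return "".join(name if pstate & bit else "-" for name, bit in names)


def d_unmasked_during(entry_pstate, exit_pstate):
    """Mirror the script's check: D masked on entry but clear on exit."""
    return masked_dflag(entry_pstate) and not masked_dflag(exit_pstate)


if __name__ == "__main__":
    # Example values chosen for illustration only.
    for value in (0x3C5, 0x1C5):
        print(hex(value), decode_daif(value))
```

The d_unmasked_during function applies the same condition as the irq_handler.exit probe above: it is true only when the D bit was set in the first value and clear in the second.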