Re:Re: [PATCH] arm64: traps: add dump instr before BUG in kernel

From: Chen Lin
Date: Thu Sep 30 2021 - 11:28:56 EST


At 2021-09-30 15:42:47, "Will Deacon" <will@xxxxxxxxxx> wrote:

>On Wed, Sep 29, 2021 at 09:29:46PM +0800, Chen Lin wrote:
>> From: Chen Lin <chen.lin5@xxxxxxxxxx>
>>
>> we should dump the real instructions before BUG in kernel mode, and
>> compare this to the instructions from objdump.
>>
>> Signed-off-by: Chen Lin <chen.lin5@xxxxxxxxxx>
>> ---
>> arch/arm64/kernel/traps.c | 7 ++++++-
>> 1 file changed, 6 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
>> index b03e383..621a9dd 100644
>> --- a/arch/arm64/kernel/traps.c
>> +++ b/arch/arm64/kernel/traps.c
>> @@ -495,7 +495,12 @@ void do_undefinstr(struct pt_regs *regs)
>> if (call_undef_hook(regs) == 0)
>> return;
>>
>> - BUG_ON(!user_mode(regs));
>> + if (!user_mode(regs)) {
>> + pr_emerg("Undef instruction in kernel, dump instr:");
>> + dump_kernel_instr(KERN_EMERG, regs);
>> + BUG();
>> + }
>
>Hmm, I'm not completely convinced about this as the instruction in the
>i-cache could be completely different. I think the PC value (for addr2line)
>is a lot more useful, and we should be printing that already.
>
>Maybe you can elaborate on a situation where this information was helpful?
>
>Thanks,
>
>Will

An undefined instruction exception can occur in several cases:

1. The CPU does not have permission to execute the instruction, or the current
CPU version does not support it. For example, executing 'mrs x0, tcr_el3'
at EL1.

2. The instruction is a normal one, but it was changed while the board was
running because of some abnormal condition, e.g. a DDR bit flip that turns a
valid instruction into an undefined one. By printing the instruction, we can
see the exact instruction code and compare it with the instruction code from
objdump to determine that it is a DDR issue.

3. In rare cases, the instructions seen by the CPU are inconsistent with the
instructions actually in DDR. You can compare the printed instructions with
the instructions in memory (perhaps via kdump) to determine whether it is a
CPU cache issue or something else.
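
To illustrate the comparison in cases 2 and 3, here is a minimal userspace
sketch (not part of the patch). It assumes the dumped instruction word and the
expected word from objdump or from a memory image are already available as
32-bit values; the helper names insn_diff_bits() and insn_is_single_bit_flip()
are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Count how many bits differ between the instruction word printed in the
 * oops and the word expected from objdump (or read back from memory).
 * Hypothetical helper, for illustration only.
 */
static unsigned int insn_diff_bits(uint32_t dumped, uint32_t expected)
{
	uint32_t diff = dumped ^ expected;	/* set bits mark mismatches */
	unsigned int count = 0;

	while (diff) {
		diff &= diff - 1;		/* clear the lowest set bit */
		count++;
	}

	return count;
}

/* Exactly one differing bit is consistent with a single bit flip. */
static bool insn_is_single_bit_flip(uint32_t dumped, uint32_t expected)
{
	return insn_diff_bits(dumped, expected) == 1;
}
```

For instance, if objdump shows 0xd503201f at the faulting PC and the dump
shows 0xd503301f, the two words differ only in bit 12, which points towards
a single bit flip rather than a code generation or patching problem.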

However, at present the instruction code that caused the synchronous 'undef
instr' exception cannot be seen, so the second and third kinds of problem
above cannot be diagnosed.

Before commit 8a60419d36762a1 ("arm64: force_signal_inject: WARN if called
from kernel context"), the instructions were printed when the CPU encountered
an undefined instruction in kernel mode.