Re: [PATCH] Debug: Add cpu registers dump in case of kernel panic

From: Thomas Gleixner
Date: Thu Jul 19 2018 - 11:11:28 EST


On Mon, 9 Jul 2018, Tian, Baofeng wrote:

> From: "Tian, Baofeng" <baofeng.tian@xxxxxxxxx>
> Subject: [PATCH] Debug: Add cpu registers dump in case of kernel panic
>
> When kernel panic, the value of cpu registers are needed for analyzing, through whole ramdump to restore the crash point.
>

Please use proper line breaks at around 78 characters.

> We store CPU registers to RAM before cpu stopped, then get them with whole ram dump after warm reset.Those registers are used in analysis of dumped RAM image.
>

This is explaining what the patch does but not really why. And it does not
explain the scope of what is stored and why that would be helpful.

> Signed-off-by: Baofeng, Tian <baofeng.tian@xxxxxxxxx>
> Signed-off-by: Emmanuel Berthier <emmanuel.berthier@xxxxxxxxx>

This SOB chain is wrong. How is Emmmanuel involved here? He's even cc'ed on this.

> ---
> arch/x86/kernel/smp.c | 19 ++++++++++++++++++-
> 1 file changed, 18 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kernel/smp.c b/arch/x86/kernel/smp.c index 5c574df..484831d 100644
> --- a/arch/x86/kernel/smp.c
> +++ b/arch/x86/kernel/smp.c
> @@ -117,6 +117,20 @@
> static atomic_t stopping_cpu = ATOMIC_INIT(-1); static bool smp_no_nmi_ipi = false;
>
> +static DEFINE_PER_CPU(struct pt_regs, cpu_regs);
> +
> +/* Store regs of this CPU for RAM dump decoding help */ static inline
> +void store_regs(struct pt_regs *regs) {

That's surely not kernel coding style.

> + struct pt_regs *print_regs;

print_regs ? There is nothing printed here.

> +
> + print_regs = &get_cpu_var(cpu_regs);
> + crash_setup_regs(print_regs, regs);
> +
> + /* Flush CPU cache */
> + wbinvd();
> +}
> +
> /*
> * this function sends a 'reschedule' IPI to another CPU.
> * it goes straight through and wastes no time serializing @@ -163,6 +177,7 @@ static int smp_stop_nmi_callback(unsigned int val, struct pt_regs *regs)
> if (raw_smp_processor_id() == atomic_read(&stopping_cpu))
> return NMI_HANDLED;
>
> + store_regs(regs);
> cpu_emergency_vmxoff();
> stop_this_cpu(NULL);
>
> @@ -173,9 +188,10 @@ static int smp_stop_nmi_callback(unsigned int val, struct pt_regs *regs)
> * this function calls the 'stop' function on all other CPUs in the system.
> */
>
> -asmlinkage __visible void smp_reboot_interrupt(void)
> +__visible void smp_reboot_interrupt(struct pt_regs *regs)
> {
> ipi_entering_ack_irq();
> + store_regs(regs);
> cpu_emergency_vmxoff();
> stop_this_cpu(NULL);
> irq_exit();
> @@ -247,6 +263,7 @@ static void native_stop_other_cpus(int wait)
> }
>
> finish:
> + store_regs(NULL);

How is storing regs _AFTER_ doing a lots of work helpful?

> local_irq_save(flags);
> disable_local_APIC();
> mcheck_cpu_clear(this_cpu_ptr(&cpu_info));

I'm still not seeing how all that is helpful, so I'm waiting for a new
version with proper explanations.

Thanks,

tglx