Re: [RFC] panic: reduce CPU consumption when finished handling panic
From: Carlos Bilbao
Date: Wed Mar 26 2025 - 11:15:13 EST
Hello,
On 3/21/25 14:03, Thomas Gleixner wrote:
> On Mon, Mar 17 2025 at 17:01, Carlos Bilbao wrote:
>> After the kernel has finished handling a panic, it enters a busy-wait loop.
>> But, this unnecessarily consumes CPU power and electricity. Plus, in VMs,
>> this negatively impacts the throughput of other VM guests running on the
>> same hypervisor.
>>
>> I propose introducing a function cpu_halt_end_panic() to halt the CPU
>> during this state while still allowing interrupts to be processed. See my
>> commit below.
> That's not the way how change logs are written. You explain the problem
> and then briefly how it is addressed.
>
> No proposal, no 'see below'. Such wording does not make any sense in a
> git commit. See Documentation/process/
>
>> @@ -276,6 +276,21 @@ static void panic_other_cpus_shutdown(bool crash_kexec)
>> crash_smp_send_stop();
> Your patch is malformed due to white space damage:
>
> patch: **** malformed patch at line 31: crash_smp_send_stop();
>
>> +static void cpu_halt_end_panic(void)
>> +{
>> +#ifdef CONFIG_X86
>> + native_safe_halt();
>> +#elif defined(CONFIG_ARM)
>> + cpu_do_idle();
>> +#else
>> + /*
>> + * Default to a simple busy-wait if no architecture-specific halt is
>> + * defined above
>> + */
>> + mdelay(PANIC_TIMER_STEP);
>> +#endif
> Architecture specific #ifdefs in core code are not the right way to
> go. Split this into three patches:
>
> 1) Add a weak fallback function
>
> void __weak cpu_halt_after_panic(void)
> {
> mdelay(PANIC_TIMER_STEP);
> }
>
> 2) Add non weak implementation in arch/x86
>
> native_safe_halt() is wrong vs. paravirtualization.
>
> safe_halt() is what you want.
>
> 3) Add non weak implementation for arch/arm
>
> Not my playground, so no comment
Sounds good, sending patch set now. Thanks for your time!
>
> Thanks,
>
> tglx
>
>
Thanks,
Carlos