Reminder: [PATCH 0/2] Reduce CPU usage when finished handling panic

From: Carlos Bilbao
Date: Thu Apr 10 2025 - 13:30:46 EST


Hello again,

I would really appreciate your opinions on this.

Thanks!
Carlos

On 3/26/25 10:12, carlos.bilbao@xxxxxxxxxx wrote:
> From: Carlos Bilbao <cbilbao@xxxxxxxxxxxxxxxx>
>
> After the kernel has finished handling a panic, it enters a busy-wait loop.
> This needlessly burns CPU cycles and power and, in VMs, reduces the
> throughput of other guests running on the same hypervisor.
>
> This patch set introduces a weak function cpu_halt_after_panic() to give
> architectures the option to halt the CPU during this state while still
> allowing interrupts to be processed. Do so for arch/x86 by overriding the
> weak function with an implementation that calls safe_halt().
>
> Here are some numbers to support this claim: the perf stats from the
> hypervisor after triggering a panic on a guest Linux kernel.
>
> Samples: 55K of event 'cycles:P', Event count (approx.): 36090772574
> Overhead  Command    Shared Object      Symbol
>   42.20%  CPU 5/KVM  [kernel.kallsyms]  [k] vmx_vmexit
>   19.07%  CPU 5/KVM  [kernel.kallsyms]  [k] vmx_spec_ctrl_restore_host
>    9.73%  CPU 5/KVM  [kernel.kallsyms]  [k] vmx_vcpu_enter_exit
>    3.60%  CPU 5/KVM  [kernel.kallsyms]  [k] __flush_smp_call_function_queue
>    2.91%  CPU 5/KVM  [kernel.kallsyms]  [k] vmx_vcpu_run
>    2.85%  CPU 5/KVM  [kernel.kallsyms]  [k] native_irq_return_iret
>    2.67%  CPU 5/KVM  [kernel.kallsyms]  [k] native_flush_tlb_one_user
>    2.16%  CPU 5/KVM  [kernel.kallsyms]  [k] llist_reverse_order
>    2.10%  CPU 5/KVM  [kernel.kallsyms]  [k] __srcu_read_lock
>    2.08%  CPU 5/KVM  [kernel.kallsyms]  [k] flush_tlb_func
>    1.52%  CPU 5/KVM  [kernel.kallsyms]  [k] vcpu_enter_guest.constprop.0
>
> And here are the results for the guest VM after applying my patch:
>
> Samples: 51 of event 'cycles:P', Event count (approx.): 37553709
> Overhead  Command          Shared Object            Symbol
>    7.94%  qemu-system-x86  [kernel.kallsyms]        [k] __schedule
>    7.94%  qemu-system-x86  libc.so.6                [.] 0x00000000000a2702
>    7.94%  qemu-system-x86  qemu-system-x86_64       [.] 0x000000000057603c
>    7.43%  qemu-system-x86  libc.so.6                [.] malloc
>    7.43%  qemu-system-x86  libc.so.6                [.] 0x00000000001af9c0
>    6.37%  IO mon_iothread  libglib-2.0.so.0.7200.4  [.] g_mutex_unlock
>    5.21%  IO mon_iothread  [kernel.kallsyms]        [k] __pollwait
>    4.70%  IO mon_iothread  [kernel.kallsyms]        [k] clear_bhb_loop
>    3.56%  IO mon_iothread  [kernel.kallsyms]        [k] __secure_computing
>    3.56%  IO mon_iothread  libglib-2.0.so.0.7200.4  [.] g_main_context_query
>    3.15%  IO mon_iothread  [kernel.kallsyms]        [k] __hrtimer_start_range_ns
>    3.15%  IO mon_iothread  [kernel.kallsyms]        [k] _raw_spin_lock_irq
>    2.88%  IO mon_iothread  libglib-2.0.so.0.7200.4  [.] g_main_context_prepare
>    2.83%  qemu-system-x86  libglib-2.0.so.0.7200.4  [.] g_slist_foreach
>    2.58%  IO mon_iothread  qemu-system-x86_64       [.] 0x00000000004e820d
>    2.21%  qemu-system-x86  libc.so.6                [.] 0x0000000000088010
>    1.94%  IO mon_iothread  [kernel.kallsyms]        [k] arch_exit_to_user_mode_prepar
>
> As you can see, CPU consumption drops significantly once the proposed
> change to the post-panic logic is applied, with KVM-related functions
> (e.g., vmx_vmexit()) falling from more than 70% of CPU usage to virtually
> nothing. The number of samples also decreased from 55K to 51, and the
> event count dropped from 36.09 billion to 37.55 million.
>
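To make the arithmetic behind these figures explicit, here is a minimal
user-space sketch in C. The numbers are copied from the two perf reports
above; the function and macro names (reduction_factor(), top_kvm_share(),
EVENTS_BEFORE, EVENTS_AFTER) are made up for illustration and are not part
of the patch set.

```c
#include <stdio.h>

/* Event counts copied from the two perf reports quoted above. */
#define EVENTS_BEFORE 36090772574.0 /* busy-wait loop after panic */
#define EVENTS_AFTER  37553709.0    /* CPU halted after panic */

/* How many times smaller `after` is compared to `before`. */
double reduction_factor(double before, double after)
{
	return before / after;
}

/*
 * Combined overhead of the three hottest KVM symbols in the first report:
 * vmx_vmexit, vmx_spec_ctrl_restore_host and vmx_vcpu_enter_exit.
 */
double top_kvm_share(void)
{
	return 42.20 + 19.07 + 9.73;
}

/* Hypothetical helper that prints both figures. */
void print_summary(void)
{
	printf("event count reduced ~%.0fx\n",
	       reduction_factor(EVENTS_BEFORE, EVENTS_AFTER));
	printf("top three KVM symbols: %.2f%%\n", top_kvm_share());
}
```

The event count comes out roughly 961 times lower, and the three hottest
KVM symbols alone add up to about 71% of the cycles in the first report,
which is where the "more than 70%" figure comes from.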
> Carlos Bilbao at DigitalOcean (2):
>   panic: Allow archs to reduce CPU consumption after panic
>   x86/panic: Use safe_halt() for CPU halt after panic
>
> ---
>
>  arch/x86/kernel/Makefile |  1 +
>  arch/x86/kernel/panic.c  |  9 +++++++++
>  kernel/panic.c           | 12 +++++++++++-
> 3 files changed, 21 insertions(+), 1 deletion(-)
> create mode 100644 arch/x86/kernel/panic.c
>
>
> From: carlos.bilbao@xxxxxxxxxx
> To: tglx@xxxxxxxxxxxxx
> Cc: bilbao@xxxxxx,
> pmladek@xxxxxxxx,
> akpm@xxxxxxxxxxxxxxxxxxxx,
> jan.glauber@xxxxxxxxx,
> jani.nikula@xxxxxxxxx,
> linux-kernel@xxxxxxxxxxxxxxx,
> gregkh@xxxxxxxxxxxxxxxxxxx,
> takakura@xxxxxxxxxxxxx,
> john.ogness@xxxxxxxxxxxxx,
> Carlos Bilbao <carlos.bilbao@xxxxxxxxxx>
> Subject: [PATCH 1/2] panic: Allow archs to reduce CPU consumption after panic
> Date: Wed, 26 Mar 2025 10:12:03 -0500
>
> From: Carlos Bilbao <carlos.bilbao@xxxxxxxxxx>
>
> After handling a panic, the kernel enters a busy-wait loop, unnecessarily
> consuming CPU and potentially impacting other workloads, including other
> guest VMs in virtualized setups.
>
> Introduce cpu_halt_after_panic(), a weak function that archs can override
> for a more efficient halt of CPU work. By default, it preserves the
> pre-existing behavior of delaying with mdelay(PANIC_TIMER_STEP).
>
> Signed-off-by: Carlos Bilbao (DigitalOcean) <carlos.bilbao@xxxxxxxxxx>
> Reviewed-by: Jan Glauber (DigitalOcean) <jan.glauber@xxxxxxxxx>
> ---
> kernel/panic.c | 12 +++++++++++-
> 1 file changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/panic.c b/kernel/panic.c
> index fbc59b3b64d0..fafe3fa22533 100644
> --- a/kernel/panic.c
> +++ b/kernel/panic.c
> @@ -276,6 +276,16 @@ static void panic_other_cpus_shutdown(bool crash_kexec)
>  		crash_smp_send_stop();
>  }
>
> +/*
> + * Called after a kernel panic has been handled, at which stage halting
> + * the CPU can help reduce unnecessary CPU consumption. In the absence of
> + * an arch-specific implementation, just delay.
> + */
> +void __weak cpu_halt_after_panic(void)
> +{
> +	mdelay(PANIC_TIMER_STEP);
> +}
> +
>  /**
>   * panic - halt the system
>   * @fmt: The text string to print
> @@ -474,7 +484,7 @@ void panic(const char *fmt, ...)
>  			i += panic_blink(state ^= 1);
>  			i_next = i + 3600 / PANIC_BLINK_SPD;
>  		}
> -		mdelay(PANIC_TIMER_STEP);
> +		cpu_halt_after_panic();
>  	}
>  }
>
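For readers less familiar with weak symbols, the mechanism in this patch
can be modeled in user space roughly as follows. This is only a sketch
under stated assumptions: it relies on the GCC/Clang weak attribute,
replaces mdelay() with a counter, and bounds the loop (the real one never
returns). post_panic_loop() and busy_wait_ms are hypothetical names, and
the PANIC_TIMER_STEP value of 100 is assumed to match kernel/panic.c.

```c
#include <stdio.h>

/* Same name as in kernel/panic.c; the value is an assumption of this sketch. */
#define PANIC_TIMER_STEP 100

/* Hypothetical counter: milliseconds "burned" by the default busy-wait. */
static unsigned long busy_wait_ms;

/*
 * Weak default, standing in for mdelay(PANIC_TIMER_STEP). It has external
 * linkage so that a non-weak definition in another object file can
 * replace it at link time.
 */
__attribute__((weak)) void cpu_halt_after_panic(void)
{
	busy_wait_ms += PANIC_TIMER_STEP;
}

/*
 * Bounded stand-in for the endless loop at the end of panic(): each
 * iteration calls whichever cpu_halt_after_panic() the linker selected.
 * Returns the total time attributed to the default busy-wait.
 */
unsigned long post_panic_loop(int iterations)
{
	for (int i = 0; i < iterations; i++)
		cpu_halt_after_panic();
	return busy_wait_ms;
}
```

Linking in another object file that provides a non-weak
cpu_halt_after_panic() replaces the default without touching the loop,
which is what the cover letter describes for x86 with safe_halt(). The
default needs external linkage for that override to take effect.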