Re: [PATCH] x86: intel_epb: Add earlyparam option to keep bias at performance

From: Dave Hansen
Date: Mon Dec 04 2023 - 12:44:59 EST


On 12/4/23 09:28, Jack Allister wrote:
> There are certain scenarios where it may be intentional that the EPB was
> set at to 0/ENERGY_PERF_BIAS_PERFORMANCE on kernel boot. For example, in
> data centers a kexec/live-update of the kernel may be performed regularly.
>
> Usually this live-update is time critical and defaulting of the bias back
> to ENERGY_PERF_BIAS_NORMAL may actually be detrimental to the overall
> update time if processors' time to ramp up/boost are affected.
>
> This patch introduces a kernel command line "intel_epb_keep_performance"
> which will leave the EPB at performance if during the restoration code path
> it is detected as such.

Folks, while I appreciate the effort to upstream thing that you have
kept out of tree up until now, I don't think this is the right way.

In general new kernel command-line options are a last resort.

> diff --git a/arch/x86/kernel/cpu/intel_epb.c b/arch/x86/kernel/cpu/intel_epb.c
> index e4c3ba91321c..0c7dd092f723 100644
> --- a/arch/x86/kernel/cpu/intel_epb.c
> +++ b/arch/x86/kernel/cpu/intel_epb.c
> @@ -50,7 +50,8 @@
> * the OS will do that anyway. That sometimes is problematic, as it may cause
> * the system battery to drain too fast, for example, so it is better to adjust
> * it on CPU bring-up and if the initial EPB value for a given CPU is 0, the
> - * kernel changes it to 6 ('normal').
> + * kernel changes it to 6 ('normal'). This however is overridable via
> + * intel_epb_keep_performance if required.
> */
>
> static DEFINE_PER_CPU(u8, saved_epb);
> @@ -75,6 +76,8 @@ static u8 energ_perf_values[] = {
> [EPB_INDEX_POWERSAVE] = ENERGY_PERF_BIAS_POWERSAVE,
> };
>
> +static bool intel_epb_keep_performance __read_mostly;
> +
> static int intel_epb_save(void)
> {
> u64 epb;
> @@ -107,8 +110,12 @@ static void intel_epb_restore(void)
> */
> val = epb & EPB_MASK;
> if (val == ENERGY_PERF_BIAS_PERFORMANCE) {
> - val = energ_perf_values[EPB_INDEX_NORMAL];
> - pr_warn_once("ENERGY_PERF_BIAS: Set to 'normal', was 'performance'\n");
> + if (!intel_epb_keep_performance) {
> + val = energ_perf_values[EPB_INDEX_NORMAL];
> + pr_warn_once("ENERGY_PERF_BIAS: Set to 'normal', was 'performance'\n");
> + } else {
> + pr_warn_once("ENERGY_PERF_BIAS: Kept at 'performance', no change\n");
> + }
> }

This is fundamentally a hack.

It sounds like you want the system default to be at
ENERGY_PERF_BIAS_PERFORMANCE. You also mentioned that this was done "on
kernel boot". How did you do that, exactly? Shouldn't that "on kernel
boot" action be reflected over here rather than introducing another
command-line parameter?