Re: [PATCH v3 3/4] cpufreq: ACPI: Use IPI to update boost MSR in cpufreq_boost_down_prep()
From: Rafael J. Wysocki
Date: Wed Jun 24 2026 - 09:37:19 EST
On Wed, Jun 24, 2026 at 1:26 PM Zhongqiu Han
<zhongqiu.han@xxxxxxxxxxxxxxxx> wrote:
>
> On 6/23/2026 8:47 PM, Rafael J. Wysocki wrote:
> > On Fri, Jun 19, 2026 at 12:45 AM Jim Mattson <jmattson@xxxxxxxxxx> wrote:
> >>
> >> During driver exit or CPU hotplug, acpi_cpufreq_cpu_exit() calls
> >> cpufreq_boost_down_prep() to re-enable boost on the target CPU. However,
> >> cpufreq_boost_down_prep() ignores the target CPU parameter and calls
> >> boost_set_msr(1) locally. Since this runs in an unbound process context, it
> >> updates the local CPU's MSR while leaving the target CPU's boost state
> >> unchanged.
> >>
> >> On Intel platforms, the open-coded read-modify-write of
> >> MSR_IA32_MISC_ENABLE in boost_set_msr() is vulnerable to preemption and
> >> thread migration if executed in process context.
> >>
> >> Fix both issues by routing the MSR update through
> >> smp_call_function_single() to execute boost_set_msr() synchronously on the
> >> target CPU. This ensures the correct CPU is updated and that the RMW
> >> sequence on Intel executes safely in IPI context with interrupts disabled.
> >>
> >> Fixes: a3605c46e0c0 ("cpufreq: acpi-cpufreq: drop rdmsr_on_cpus() usage")
> >> Reported-by: Sashiko <sashiko-bot@xxxxxxxxxx>
> >> Closes: https://sashiko.dev/#/patchset/20260612215729.1532175-1-jmattson%40google.com
> >> Assisted-by: Gemini:gemini-3.5-pro
> >> Signed-off-by: Jim Mattson <jmattson@xxxxxxxxxx>
> >> ---
> >> drivers/cpufreq/acpi-cpufreq.c | 7 ++++---
> >> 1 file changed, 4 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/drivers/cpufreq/acpi-cpufreq.c b/drivers/cpufreq/acpi-cpufreq.c
> >> index 21639d9ac753..48d678812fd6 100644
> >> --- a/drivers/cpufreq/acpi-cpufreq.c
> >> +++ b/drivers/cpufreq/acpi-cpufreq.c
> >> @@ -528,10 +528,11 @@ static void free_acpi_perf_data(void)
> >> static int cpufreq_boost_down_prep(unsigned int cpu)
> >> {
> >> /*
> >> - * Clear the boost-disable bit on the CPU_DOWN path so that
> >> - * this cpu cannot block the remaining ones from boosting.
> >> + * Clear the boost-disable bit on the target CPU so that
> >> + * it cannot block the remaining ones from boosting.
> >> */
> >> - return boost_set_msr(1);
> >> + return smp_call_function_single(cpu, boost_set_msr_each,
> >> + (void *)1L, 1);
> >
> > While I have not investigated the history of this code, I'm wondering
> > why cpufreq_boost_down_prep() is a separate function and why it takes
> > the CPU number argument.
> >
> > The only caller of it is acpi_cpufreq_cpu_exit() which could just do
> >
> > set_boost(policy, 1);
> >
> > instead of calling it (and might check if boost was supported at all
> > for that matter).
>
>
> 1) It seems that set_boost(policy, 1) would be a no-op here, since
> policy->cpus appears to be empty by the time ->exit() is called.
All right, but this means that ->exit() is not quite suitable for this
type of cleanup.
> acpi-cpufreq only implements ->exit (no ->offline), and ->exit() is
> reached only when policy_is_inactive() (i.e. cpumask_empty
> (policy->cpus)) is true. set_boost() does on_each_cpu_mask(policy->cpus,
> ...), which on an empty mask runs nothing.
Arguably though, .offline() should take care of the boost cleanup for
the CPU going offline. .exit() is too late because it can't clean up
CPUs that went offline before.
> 2) The patch's restore on only policy->cpu
> (smp_call_function_single(policy->cpu, ...) will not fully cover shared
> policies, and how much it covers seems to depend on the boost-disable
> bit's scope (but that seems to be already discussed in the cover letter
> and is probably a separate topic).
>
> Please feel free to correct me if I'm misunderstanding the scope of the
> boost-disable bit:
>
> On driver unload the policy CPUs are removed from policy->cpus but seem
> to stay hardware-online, so a stale disable bit on a non-last CPU might
> still matter (those CPUs are presumably still running and may want to
> boost). Abstractly:
>
> - if the bit is per-CPU independent: clearing policy->cpu seems to fix
> only that CPU, and the other still-online CPUs in the policy may stay
> disabled.
Yes.
> - if the bit is shared within a group (e.g. core/package), "disabled if
> any member set": clearing policy->cpu maybe only covers its own group,
> and other groups spanned by the policy could stay disabled.
Yes.
Thus it is better to clear it at offline time for the given CPU.