Re: [PATCH v3 3/4] cpufreq: ACPI: Use IPI to update boost MSR in cpufreq_boost_down_prep()
From: Zhongqiu Han
Date: Thu Jun 25 2026 - 11:50:29 EST
On 6/24/2026 9:35 PM, Rafael J. Wysocki wrote:
On Wed, Jun 24, 2026 at 1:26 PM Zhongqiu Han
<zhongqiu.han@xxxxxxxxxxxxxxxx> wrote:
On 6/23/2026 8:47 PM, Rafael J. Wysocki wrote:
On Fri, Jun 19, 2026 at 12:45 AM Jim Mattson <jmattson@xxxxxxxxxx> wrote:
During driver exit or CPU hotplug, acpi_cpufreq_cpu_exit() calls
cpufreq_boost_down_prep() to re-enable boost on the target CPU. However,
cpufreq_boost_down_prep() ignores the target CPU parameter and calls
boost_set_msr(1) locally. Since this runs in an unbound process context, it
updates the local CPU's MSR while leaving the target CPU's boost state
unchanged.
On Intel platforms, the open-coded read-modify-write of
MSR_IA32_MISC_ENABLE in boost_set_msr() is vulnerable to preemption and
thread migration if executed in process context.
Fix both issues by routing the MSR update through
smp_call_function_single() to execute boost_set_msr() synchronously on the
target CPU. This ensures the correct CPU is updated and that the RMW
sequence on Intel executes safely in IPI context with interrupts disabled.
Fixes: a3605c46e0c0 ("cpufreq: acpi-cpufreq: drop rdmsr_on_cpus() usage")
Reported-by: Sashiko <sashiko-bot@xxxxxxxxxx>
Closes: https://sashiko.dev/#/patchset/20260612215729.1532175-1-jmattson%40google.com
Assisted-by: Gemini:gemini-3.5-pro
Signed-off-by: Jim Mattson <jmattson@xxxxxxxxxx>
---
drivers/cpufreq/acpi-cpufreq.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/cpufreq/acpi-cpufreq.c b/drivers/cpufreq/acpi-cpufreq.c
index 21639d9ac753..48d678812fd6 100644
--- a/drivers/cpufreq/acpi-cpufreq.c
+++ b/drivers/cpufreq/acpi-cpufreq.c
@@ -528,10 +528,11 @@ static void free_acpi_perf_data(void)
static int cpufreq_boost_down_prep(unsigned int cpu)
{
/*
- * Clear the boost-disable bit on the CPU_DOWN path so that
- * this cpu cannot block the remaining ones from boosting.
+ * Clear the boost-disable bit on the target CPU so that
+ * it cannot block the remaining ones from boosting.
*/
- return boost_set_msr(1);
+ return smp_call_function_single(cpu, boost_set_msr_each,
+ (void *)1L, 1);
While I have not investigated the history of this code, I'm wondering
why cpufreq_boost_down_prep() is a separate function and why it takes
the CPU number argument.
The only caller of it is acpi_cpufreq_cpu_exit() which could just do
set_boost(policy, 1);
instead of calling it (and might check if boost was supported at all
for that matter).
1) It seems that set_boost(policy, 1) would be a no-op here, since
policy->cpus appears to be empty by the time ->exit() is called.
All right, but this means that ->exit() is not quite suitable for this
type of cleanup.
acpi-cpufreq only implements ->exit (no ->offline), and ->exit() is
reached only when policy_is_inactive() (i.e. cpumask_empty
(policy->cpus)) is true. set_boost() does on_each_cpu_mask(policy->cpus,
...), which on an empty mask runs nothing.
Arguably though, .offline() should take care of the boost cleanup for
the CPU going offline. .exit() is too late because it can't clean up
CPUs that went offline before.
2) The patch's restore on only policy->cpu
(smp_call_function_single(policy->cpu, ...) will not fully cover shared
policies, and how much it covers seems to depend on the boost-disable
bit's scope (but that seems to be already discussed in the cover letter
and is probably a separate topic).
Please feel free to correct me if I'm misunderstanding the scope of the
boost-disable bit:
On driver unload the policy CPUs are removed from policy->cpus but seem
to stay hardware-online, so a stale disable bit on a non-last CPU might
still matter (those CPUs are presumably still running and may want to
boost). Abstractly:
- if the bit is per-CPU independent: clearing policy->cpu seems to fix
only that CPU, and the other still-online CPUs in the policy may stay
disabled.
Yes.
- if the bit is shared within a group (e.g. core/package), "disabled if
any member set": clearing policy->cpu maybe only covers its own group,
and other groups spanned by the policy could stay disabled.
Yes.
Thus it is better to clear it at offline time for the given CPU.
Hi Rafael,
Thanks for the clarification.
I agree. Just for discussion, two possible approaches come to mind:
Option 1 (structurally clean):
Introduce an optional per-CPU cpufreq_driver->set_boost_cpu(cpu, enable)
hook and call it from __cpufreq_offline() before cpumask_clear_cpu(),
but only on the unbind path (not on plain CPU hotplug-offline - clearing
it there maybe wrongly re-enable boost on a still-online sibling on
shared-scope HW). Roughly (rough sketch only):
diff --git a/drivers/cpufreq/acpi-cpufreq.c b/drivers/cpufreq/acpi-cpufreq.c
index 21639d9ac753..5f298fbb0d8c 100644
--- a/drivers/cpufreq/acpi-cpufreq.c
+++ b/drivers/cpufreq/acpi-cpufreq.c
@@ -525,13 +525,10 @@ static void free_acpi_perf_data(void)
free_percpu(acpi_perf_data);
}
-static int cpufreq_boost_down_prep(unsigned int cpu)
+static void acpi_cpufreq_set_boost_cpu(unsigned int cpu, bool enable)
{
- /*
- * Clear the boost-disable bit on the CPU_DOWN path so that
- * this cpu cannot block the remaining ones from boosting.
- */
- return boost_set_msr(1);
+ smp_call_function_single(cpu, boost_set_msr_each,
+ (void *)(long)enable, 1);
}
/*
@@ -951,7 +948,6 @@ static void acpi_cpufreq_cpu_exit(struct cpufreq_policy *policy)
pr_debug("%s\n", __func__);
- cpufreq_boost_down_prep(policy->cpu);
policy->fast_switch_possible = false;
policy->driver_data = NULL;
acpi_processor_unregister_performance(data->acpi_perf_cpu);
@@ -986,6 +982,7 @@ static struct cpufreq_driver acpi_cpufreq_driver = {
.bios_limit = acpi_processor_get_bios_limit,
.init = acpi_cpufreq_cpu_init,
.exit = acpi_cpufreq_cpu_exit,
+ .set_boost_cpu = acpi_cpufreq_set_boost_cpu,
.resume = acpi_cpufreq_resume,
.name = "acpi-cpufreq",
.attr = acpi_cpufreq_attr,
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 507224c9ecd3..c4952ca3e8a5 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -1669,13 +1669,18 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif)
return 0;
}
-static void __cpufreq_offline(unsigned int cpu, struct cpufreq_policy *policy)
+static void __cpufreq_offline(unsigned int cpu, struct cpufreq_policy *policy,
+ bool unbinding)
{
int ret;
if (has_target())
cpufreq_stop_governor(policy);
+ if (unbinding && cpufreq_driver->set_boost_cpu &&
+ policy->boost_supported)
+ cpufreq_driver->set_boost_cpu(cpu, true);
+
cpumask_clear_cpu(cpu, policy->cpus);
if (!policy_is_inactive(policy)) {
@@ -1730,7 +1735,7 @@ static int cpufreq_offline(unsigned int cpu)
guard(cpufreq_policy_write)(policy);
- __cpufreq_offline(cpu, policy);
+ __cpufreq_offline(cpu, policy, false);
return 0;
}
@@ -1750,7 +1755,7 @@ static void cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif)
scoped_guard(cpufreq_policy_write, policy) {
if (cpu_online(cpu))
- __cpufreq_offline(cpu, policy);
+ __cpufreq_offline(cpu, policy, true);
remove_cpu_dev_symlink(policy, cpu, dev);
diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
index ae9d1ce4f49c..c174b013dd76 100644
--- a/include/linux/cpufreq.h
+++ b/include/linux/cpufreq.h
@@ -413,6 +413,8 @@ struct cpufreq_driver {
int (*online)(struct cpufreq_policy *policy);
int (*offline)(struct cpufreq_policy *policy);
void (*exit)(struct cpufreq_policy *policy);
+ /* Optional */
+ void (*set_boost_cpu)(unsigned int cpu, bool enable);
int (*suspend)(struct cpufreq_policy *policy);
int (*resume)(struct cpufreq_policy *policy);
Option 2 (lighter, but it is at odds with the direction discussed
above): stay in acpi-cpufreq's ->exit() and restore over
policy->related_cpus instead.
--
Thx and BRs,
Zhongqiu Han