[PATCH v2 05/11] stop_machine: Introduce stop_machine_nmi_cpuslocked()
From: Chang S. Bae
Date: Mon Mar 30 2026 - 22:16:15 EST
With the NMI control logic in place, introduce an API to run the target
function from NMI context.
Originally-by: David Kaplan <david.kaplan@xxxxxxx>
Suggested-by: Borislav Petkov <bp@xxxxxxxxx>
Signed-off-by: Chang S. Bae <chang.seok.bae@xxxxxxxxx>
Link: https://lore.kernel.org/lkml/20260202105411.GVaYCCUygtEUNrMUtG@fat_crate.local
---
V1 -> V2:
* Support the nmi_cpus mask (Boris), including @cpus=NULL cases
* Split out the API introduction
* Drop stop_machine_nmi() [**]
[**] it could be added, but this series has no user for it yet
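
For context, a hypothetical caller might look like the sketch below. Everything
except the new API is an assumption: update_ucode() and do_update() are made-up
placeholder names, not part of this series, and error handling is elided.

	/*
	 * Hypothetical usage sketch: run a noinstr callback on every online
	 * CPU from NMI context while the machine is stopped.
	 */
	static int noinstr update_ucode(void *data)
	{
		/* NMI context: no instrumentation, no sleeping, no locks */
		return 0;
	}

	static int do_update(void)
	{
		int ret;

		cpus_read_lock();
		/* Passing @cpus=NULL would run on a single arbitrary CPU */
		ret = stop_machine_nmi_cpuslocked(update_ucode, NULL,
						  cpu_online_mask);
		cpus_read_unlock();

		return ret;
	}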
---
include/linux/stop_machine.h | 24 ++++++++++++++++++++
kernel/stop_machine.c | 43 ++++++++++++++++++++++++++++++++++++
2 files changed, 67 insertions(+)
diff --git a/include/linux/stop_machine.h b/include/linux/stop_machine.h
index 9424d363ab38..2da9aa0ec3d3 100644
--- a/include/linux/stop_machine.h
+++ b/include/linux/stop_machine.h
@@ -201,6 +201,30 @@ stop_machine_from_inactive_cpu(cpu_stop_fn_t fn, void *data,
void arch_send_self_nmi(void);
bool noinstr stop_machine_nmi_handler(void);
+/**
+ * stop_machine_nmi_cpuslocked() - Freeze CPUs and run a function in NMI context
+ *
+ * @nmisafe_fn: The function to run
+ * @data: The data pointer for @nmisafe_fn()
+ * @cpus: A cpumask containing the CPUs to run @nmisafe_fn() on. If NULL,
+ * @nmisafe_fn() runs on a single (arbitrary) CPU from
+ * cpu_online_mask.
+ *
+ * Description: This stop_machine() variant runs @nmisafe_fn() from NMI context
+ * to prevent preemption by other NMIs. The callback must be built with noinstr.
+ * Other than that, the semantics match stop_machine_cpuslocked().
+ *
+ * Context: Must be called from within a cpus_read_lock() protected region.
+ * Avoid nested calls to cpus_read_lock().
+ *
+ * Return: 0 if all invocations of @nmisafe_fn return zero, -ENOMEM if cpumask
+ * allocation fails, -EINVAL if any target CPU failed to receive an NMI.
+ * Otherwise, the accumulated return value from the invocations of @nmisafe_fn
+ * that returned non-zero.
+ */
+int stop_machine_nmi_cpuslocked(cpu_stop_nmisafe_fn_t nmisafe_fn, void *data,
+ const struct cpumask *cpus);
+
#else
static inline bool stop_machine_nmi_handler(void) { return false; }
#endif /* CONFIG_STOP_MACHINE_NMI */
diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index 45ea62f1b2b5..e20e4d3e7b16 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -798,6 +798,49 @@ static int multi_stop_run(struct multi_stop_data *msdata)
return msdata->use_nmi ? nmi_stop_run(msdata) : msdata->fn(msdata->data);
}
+int stop_machine_nmi_cpuslocked(cpu_stop_nmisafe_fn_t nmisafe_fn, void *data,
+ const struct cpumask *cpus)
+{
+ struct multi_stop_data msdata = {
+ .nmisafe_fn = nmisafe_fn,
+ .data = data,
+ .num_threads = num_online_cpus(),
+ .active_cpus = cpus,
+ .use_nmi = true,
+ };
+ int ret;
+
+ if (!zalloc_cpumask_var(&msdata.nmi_cpus, GFP_KERNEL))
+ return -ENOMEM;
+
+	/*
+	 * The NMI CPUs should be exactly the 'active' CPUs executing the
+	 * stop function. If @cpus was not provided, follow the selection
+	 * logic in multi_cpu_stop() and pick the first online CPU.
+	 */
+ if (!msdata.active_cpus)
+ cpumask_set_cpu(cpumask_first(cpu_online_mask), msdata.nmi_cpus);
+ else
+ cpumask_copy(msdata.nmi_cpus, msdata.active_cpus);
+
+ lockdep_assert_cpus_held();
+
+ ret = stop_multi_cpus(&msdata);
+
+	/*
+	 * The NMI handler clears each CPU's bit. Any bit still set means
+	 * that CPU never received its NMI, so report it as an error.
+	 */
+ if (!cpumask_empty(msdata.nmi_cpus)) {
+ pr_err("CPUs %*pbl didn't run the stop_machine NMI handler.\n",
+ cpumask_pr_args(msdata.nmi_cpus));
+ ret = -EINVAL;
+ }
+
+ free_cpumask_var(msdata.nmi_cpus);
+ return ret;
+}
+
#else
static int multi_stop_run(struct multi_stop_data *msdata)
--
2.51.0