Re: [PATCH 1/7] stop_machine: Introduce stop_machine_nmi()

From: Chang S. Bae

Date: Thu Jan 29 2026 - 10:48:20 EST


On 1/29/2026 4:17 AM, Borislav Petkov wrote:
On Wed, Jan 28, 2026 at 05:35:56PM +0100, Borislav Petkov wrote:>
-
-static DEFINE_STATIC_KEY_FALSE(stop_machine_nmi_handler_enable);
-
-bool stop_machine_nmi_handler_enabled(void)
-{
- return static_branch_unlikely(&stop_machine_nmi_handler_enable);
-}
Thanks for the detailed write-up! I just wanted to clarify a few points, if you don't mind:

First, on patch6, it is going to be used by microcode_offline_nmi_handler()
{
if (!raw_cpu_read(ucode_ctrl.nmi_enabled))
return;
raw_cpu_write(ucode_ctrl.nmi_enabled, false);
raw_cpu_write(ucode_ctrl.result, UCODE_OFFLINE);
raw_atomic_inc(&offline_in_nmi);
wait_for_ctrl();
}

I assume you consider the per-CPU nmi_enabled bit is enough going there. Then,

DEFINE_IDTENTRY_RAW(exc_nmi)
{
...
if (arch_cpu_is_offline(smp_processor_id())) {
microcode_offline_nmi_handler();
return;
}
...
}

Correct?

The other bit is possible bug after the removal. The NMI path effectively becomes:

static noinstr void default_do_nmi(struct pt_regs *regs)
...
+ if (stop_machine_nmi_handler())
+ goto out;

which means it could be invoked from an NMI unrelated to stop-machine.

bool noinstr stop_machine_nmi_handler(void)
{
- struct multi_stop_data *msdata;
+ struct multi_stop_data *msdata = raw_cpu_read(stop_machine_nmi_ctrl.msdata);

In that case, msdata could still be NULL here.

int err;
- if (!raw_cpu_read(stop_machine_nmi_ctrl.nmi_enabled))
+ if (!cpumask_test_and_clear_cpu(smp_processor_id(), &msdata->nmi_cpus))
return false;
- raw_cpu_write(stop_machine_nmi_ctrl.nmi_enabled, false);
-
- msdata = raw_cpu_read(stop_machine_nmi_ctrl.msdata);
instrumentation_begin();
err = msdata->fn(msdata->data);
instrumentation_end();

Also, I suppose

this_cpu_write(stop_machine_nmi_ctrl.msdata, NULL);

@@ -243,6 +232,17 @@ bool noinstr stop_machine_nmi_handler(void)
return true;
}

Finally, while I do appreciate the nmi_cpus approach, I could also think another simple msdata == NULL check could be a guard here too.

Thanks,
Chang