Re: [PATCH 1/7] stop_machine: Introduce stop_machine_nmi()


From: Borislav Petkov

Date: Wed Mar 04 2026 - 11:39:32 EST

On Thu, Feb 05, 2026 at 06:14:39PM -0800, Chang S. Bae wrote:
> On 2/2/2026 2:54 AM, Borislav Petkov wrote:
> ...
> > @@ -174,8 +174,26 @@ struct multi_stop_data {
> > enum multi_stop_state state;
> > atomic_t thread_ack;
> > +
> > + bool use_nmi;
> > +
> > + /*
> > + * cpumasks of CPUs on which to raise an NMI; used in the NMI
> > + * stop_machine variant. nmi_cpus_done is used for tracking
> > + * when the NMI handler has executed successfully.
> > + */
> > + struct cpumask nmi_cpus;
> > + struct cpumask nmi_cpus_done;
> > +
> > +};
>
> Looks like every stop_machine variant then will spend stack for these masks.
> It seems they could be cpumask_var_t.

I guess...

> Alternatively, to make it simple further, a per-CPU variable could achieve
> this if I understand correctly:
>
> struct stop_machine_nmi_ctrl {
> ...
> bool done;
> }

The first mask, nmi_cpus, guards against the NMI handler running again. The
second one tracks whether all CPUs have run the NMI handler.

I guess simply checking whether the nmi_cpus mask is *not* empty would tell
us that too, so we are probably fine with a single mask only.
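
To illustrate the single-mask idea, here is a rough userspace sketch, not the
patch itself. It assumes at most 64 CPUs and uses a plain atomic word in place
of a struct cpumask; the names nmi_pending, nmi_arm(), nmi_should_run(),
nmi_mark_done() and nmi_all_done() are all made up for the example. A set bit
means "this CPU still has to run the handler", so the same mask both guards
against a repeated run and tells the initiator when everyone is done:

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a cpumask: one bit per CPU, at most 64 CPUs. */
static _Atomic uint64_t nmi_pending;

/* Initiator: mark the CPUs that still have to run the handler. */
static void nmi_arm(uint64_t cpus)
{
	atomic_store(&nmi_pending, cpus);
}

/*
 * Handler entry: a clear bit means this CPU was never targeted or has
 * already run the handler, so it must not run it (again).
 */
static bool nmi_should_run(unsigned int cpu)
{
	return atomic_load(&nmi_pending) & (UINT64_C(1) << cpu);
}

/* Handler exit: clear our own bit once the work is finished. */
static void nmi_mark_done(unsigned int cpu)
{
	atomic_fetch_and(&nmi_pending, ~(UINT64_C(1) << cpu));
}

/* Initiator: every targeted CPU is done once the mask is empty. */
static bool nmi_all_done(void)
{
	return atomic_load(&nmi_pending) == 0;
}
```

The initiator would spin while the mask is not empty. Each CPU only ever
clears its own bit, which is why a separate "done" mask is not strictly
needed in this sketch.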

> I don't know whether that was an intentional design choice or not. But, at
> least the NMI variant might have slightly different semantics in this regard.

This current behavior doesn't make a whole lot of sense to me - at least from
what I'm reading. I think it is clearly better if the caller gets told when
some NMI handler failed instead of overwriting an error val.

But maybe we'll fix that while we're at it.
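
One possible shape for reporting failures without overwriting an earlier
error, purely as a userspace sketch and not what the patch does: keep the
first non-zero return value and additionally record which CPUs failed. The
struct nmi_result and nmi_report() names are hypothetical, and the sketch
again assumes at most 64 CPUs:

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical result record shared by all handlers (at most 64 CPUs). */
struct nmi_result {
	_Atomic int first_err;		/* 0 until some handler fails */
	_Atomic uint64_t failed_cpus;	/* one bit per CPU whose handler failed */
};

/* Called by each handler with its return value. */
static void nmi_report(struct nmi_result *res, unsigned int cpu, int err)
{
	int expected = 0;

	if (!err)
		return;

	/* Remember every CPU that failed. */
	atomic_fetch_or(&res->failed_cpus, UINT64_C(1) << cpu);

	/* Only the first failure is stored; later ones do not overwrite it. */
	atomic_compare_exchange_strong(&res->first_err, &expected, err);
}
```

The caller then gets a stable error value plus the set of CPUs on which the
handler failed, instead of whichever error happened to be written last.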

Thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette