[PATCH v2 06/12] smp: Enable preemption early in smp_call_function_many_cond

From: Chuyi Zhou

Date: Mon Mar 02 2026 - 02:54:14 EST


Currently, smp_call_function_many_cond() disables preemption mainly for the
following reasons:

- To prevent the remote online CPU from going offline. Specifically, we
want to ensure that no new csds are queued after smpcfd_dying_cpu() has
finished. Therefore, preemption must be disabled until all necessary IPIs
are sent.

- To prevent migration to another CPU, which also implicitly prevents the
current CPU from going offline (since stop_machine requires preempting the
current task to execute offline callbacks).

- To protect the per-cpu cfd_data from concurrent modification by other
smp_call_*() callers on the current CPU. cfd_data contains cpumasks and
per-cpu csds. Before enqueueing a csd, we block in csd_lock() to ensure the
previous async csd->func() has completed, and then initialize csd->func and
csd->info. After sending the IPI, we spin-wait for the remote CPU to call
csd_unlock(). The csd_lock mechanism therefore already guarantees csd
serialization. If preemption occurs during csd_lock_wait(), other concurrent
smp_call_function_many_cond() calls will simply block until the previous
csd->func() completes:

task A                          task B

csd_lock(csd);
csd->func = func_a;
send IPIs

preempted by B
--------------->
                                csd_lock(csd); // blocks until the last
                                               // func_a has completed
                                csd->func = func_b;
                                csd->info = info;
                                ...
                                send IPIs

switch back to A
<---------------

csd_lock_wait(csd); // blocks until the remote CPU finishes func_*

This patch re-enables preemption before csd_lock_wait(), making the
potentially long-running csd_lock_wait() preemptible and migratable.
Note that being migrated to another CPU and then calling csd_lock_wait()
could cause a UAF if smpcfd_dead_cpu() runs while the original CPU goes
offline. The previous patch used RCU to synchronize csd_lock_wait() with
smpcfd_dead_cpu() and thereby prevent that UAF.

Signed-off-by: Chuyi Zhou <zhouchuyi@xxxxxxxxxxxxx>
---
kernel/smp.c | 23 +++++++++++++++++++----
1 file changed, 19 insertions(+), 4 deletions(-)

diff --git a/kernel/smp.c b/kernel/smp.c
index ad6073b71bbd..18e7e4a8f1b6 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -801,7 +801,7 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
smp_cond_func_t cond_func)
{
bool preemptible_wait = !IS_ENABLED(CONFIG_CPUMASK_OFFSTACK);
- int cpu, last_cpu, this_cpu = smp_processor_id();
+ int cpu, last_cpu, this_cpu;
struct call_function_data *cfd;
bool wait = scf_flags & SCF_WAIT;
cpumask_var_t cpumask_stack;
@@ -809,9 +809,9 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
int nr_cpus = 0;
bool run_remote = false;

- lockdep_assert_preemption_disabled();
-
rcu_read_lock();
+ this_cpu = get_cpu();
+
cfd = this_cpu_ptr(&cfd_data);
cpumask = cfd->cpumask;

@@ -898,6 +898,19 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
local_irq_restore(flags);
}

+ /*
+ * We may block in csd_lock_wait() for a significant amount of time,
+ * especially when interrupts are disabled or with a large number of
+ * remote CPUs. Try to enable preemption before csd_lock_wait().
+ *
+ * Use the cpumask_stack instead of cfd->cpumask to avoid concurrent
+ * modification by other tasks on the same CPU. If preemption occurs
+ * during csd_lock_wait(), other concurrent smp_call_function_many_cond()
+ * calls will simply block until the previous csd->func() completes.
+ */
+ if (preemptible_wait)
+ put_cpu();
+
if (run_remote && wait) {
for_each_cpu(cpu, cpumask) {
call_single_data_t *csd;
@@ -907,7 +920,9 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
}
}

- if (preemptible_wait)
+ if (!preemptible_wait)
+ put_cpu();
+ else
free_cpumask_var(cpumask_stack);
rcu_read_unlock();
}
--
2.20.1