Re: [PATCH v3 1/2] cpuhp: Optimize SMT switch operation by batching lock acquisition
From: Vishal Chourasia
Date: Thu Mar 26 2026 - 06:10:31 EST
Hi Thomas,

Thank you for the review.

Here are numbers from a 400-CPU system that I had from a while back:

baseline: Linux 6.19.0-rc4-00310-g755bc1335e3b

On a PPC64 system with 400 CPUs:

SMT8 to SMT1:
baseline: real 1m14.792s
baseline+patch: real 0m03.205s # ~23x improvement

SMT1 to SMT8:
baseline: real 2m27.695s
baseline+patch: real 0m02.510s # ~58x improvement
Note: We observe huge improvements on a max-config system as well: an
SMT state switch that originally took approximately 1 hour completes in
5 to 6 minutes with expedited GPs.
Analysis: why expediting GPs improves the time to complete:

By expediting the grace period, we force immediate IPI-driven
quiescent-state detection across all CPUs rather than waiting lazily,
which dramatically reduces the time the calling thread remains blocked
in synchronize_rcu().
Why won't holding cpus_write_lock() for the duration of the SMT switch
work [1]? It causes hung-task timeout splats [2] because other threads
remain blocked on cpus_read_lock(). Expediting grace periods shrinks
that window but doesn't eliminate it. I plan to drop this patch; the
next version will only carry the expedited RCU grace period change.
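For reference, what I have in mind for the next version amounts to
bracketing the per-CPU hotplug loop with rcu_expedite_gp() /
rcu_unexpedite_gp(), so every synchronize_rcu() issued during the batch
takes the expedited path. This is only a sketch against the current
cpuhp_smt_disable(); the exact placement and error handling may differ
in the posted patch:

```
int cpuhp_smt_disable(enum cpuhp_smt_control ctrlval)
{
	int cpu, ret = 0;

	cpu_maps_update_begin();

	/* Force IPI-driven quiescent-state detection for the batch. */
	rcu_expedite_gp();

	for_each_online_cpu(cpu) {
		if (topology_is_primary_thread(cpu))
			continue;
		ret = cpu_down_maps_locked(cpu, CPUHP_OFFLINE);
		if (ret)
			break;
		/* See comments in the existing code about offline state. */
		cpuhp_set_state(cpu, per_cpu_ptr(&cpuhp_state, cpu),
				CPUHP_OFFLINE);
	}

	/* Restore normal (lazy) grace-period behaviour. */
	rcu_unexpedite_gp();

	if (!ret)
		cpu_smt_control = ctrlval;
	cpu_maps_update_end();
	return ret;
}
```

The expedite/unexpedite pair nests, so this composes safely with any
other caller that has already expedited grace periods.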
I will incorporate all your other suggestions in the next version.
[1] https://lore.kernel.org/all/20260113090153.GS830755@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/
[2] https://lore.kernel.org/all/aapprY-prH0l_WeK@xxxxxxxxxxxxx/
On Wed, Mar 25, 2026 at 08:09:17PM +0100, Thomas Gleixner wrote:
> On Wed, Feb 18 2026 at 14:09, Vishal Chourasia wrote:
> > From: Joel Fernandes <joelagnelf@xxxxxxxxxx>
> >
> > Bulk CPU hotplug operations, such as an SMT switch operation, requires
> > hotplugging multiple CPUs. The current implementation takes
> > cpus_write_lock() for each individual CPU, causing multiple slow grace
> > period requests.
> >
> > Introduce cpu_up_locked() and cpu_down_locked() that assume the caller
> > already holds cpus_write_lock(). The cpuhp_smt_enable() and
> > cpuhp_smt_disable() functions are updated to hold the lock once around
> > the entire loop, rather than for each individual CPU.
> >
> > Link: https://lore.kernel.org/all/20260113090153.GS830755@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/
> > Suggested-by: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> > Signed-off-by: Vishal Chourasia <vishalc@xxxxxxxxxxxxx>
>
> You dropped Joel's Signed-off-by ....
Sorry for messing up the changelog w.r.t. the Signed-off-by tag.
I will take care of this in the future.
>
> > -/* Requires cpu_add_remove_lock to be held */
> > -static int __ref _cpu_down(unsigned int cpu, int tasks_frozen,
> > +/* Requires cpu_add_remove_lock and cpus_write_lock to be held */
> > +static int __ref cpu_down_locked(unsigned int cpu, int tasks_frozen,
> > enum cpuhp_state target)
>
> No line break required. You have 100 chars. If you still need one:
>
> https://www.kernel.org/doc/html/latest/process/maintainer-tip.html
Ack.
>
> > */
> > if (cpumask_any_and(cpu_online_mask,
> > housekeeping_cpumask(HK_TYPE_DOMAIN)) >= nr_cpu_ids) {
> > - ret = -EBUSY;
> > - goto out;
> > + return -EBUSY;
> > }
>
> Please remove the brackets. They are no longer required. All over the place.
Ack.
>
> > +static int __ref _cpu_down(unsigned int cpu, int tasks_frozen,
> > + enum cpuhp_state target)
> > +{
> > +
> > + int ret;
> > + cpus_write_lock();
>
> Coding style...
Ack.
>
> > + ret = cpu_down_locked(cpu, tasks_frozen, target);
> > cpus_write_unlock();
> > arch_smt_update();
> > return ret;
> > @@ -2659,6 +2674,16 @@ int cpuhp_smt_disable(enum cpuhp_smt_control ctrlval)
> > int cpu, ret = 0;
> >
> > cpu_maps_update_begin();
> > + if (cpu_hotplug_offline_disabled) {
> > + ret = -EOPNOTSUPP;
> > + goto out;
> > + }
> > + if (cpu_hotplug_disabled) {
> > + ret = -EBUSY;
> > + goto out;
> > + }
> > + /* Hold cpus_write_lock() for entire batch operation. */
> > + cpus_write_lock();
>
> .... for the entire ...
>
> And please visually separate things. Newlines exist for a reason.
Sure.
>
> Thanks,
>
> tglx
Thanks and Regards!
Vishalc