Re: [BUG] cgroups/cpusets: Spurious CPU-hotplug failures
From: Waiman Long
Date: Tue Mar 24 2026 - 20:04:32 EST
On 3/24/26 5:41 AM, Paul E. McKenney wrote:
> On Wed, Mar 18, 2026 at 11:43:37AM -0700, Paul E. McKenney wrote:
>> On Wed, Mar 18, 2026 at 11:02:16AM -0400, Waiman Long wrote:
>>> On 3/18/26 8:53 AM, Paul E. McKenney wrote:
>>>> Hello!
>>>>
>>>> Running rcutorture on v7.0-rc3 results in spurious CPU-hotplug
>>>> failures, most frequently on the TREE03 scenario, which suffers
>>>> about ten such failures per hundred hours of test time.  Repeat-by
>>>> is as follows:
>>>>
>>>> tools/testing/selftests/rcutorture/bin/kvm.sh --cpus 80 --duration 100h --configs "100*TREE03" --trust-make
>>>>
>>>> A faster repeat-by instead uses kvm-remote.sh and lots of systems.
>>>>
>>>> Bisection converges here:
>>>>
>>>> 6df415aa46ec ("cgroup/cpuset: Defer housekeeping_update() calls from CPU hotplug to workqueue")
>>>>
>>>> Reverting this commit gets rid of the spurious CPU-hotplug
>>>> failures.  Of course, it also gets rid of some of the ability to
>>>> do dynamic nohz_full processing.
>>>>
>>>> Now, the problem might be that the workqueue handler is still in
>>>> flight by the time rcutorture fires up the next CPU-hotplug
>>>> operation, especially given that the TREE03 scenario waits only
>>>> 200 milliseconds between these operations.  This suggests waiting
>>>> for this handler before ending each CPU-hotplug operation, and the
>>>> crude patch below does make the problem go away.
>>>>
>>>> This alleged fix is quite heavy-handed, and also fragile in that
>>>> it breaks if hk_sd_workfn() ever moves to a different workqueue.
>>>> It might be better to call into the cgroups/cpusets code and use
>>>> flush_work() to wait only on hk_sd_workfn() and nothing else.  But
>>>> it seemed best to keep things trivial to start with.
>>>>
>>>> Either way, please consider the patch below to be part of this bug
>>>> report rather than a proper fix.
>>>>
>>>> Thoughts?
>>>>
>>>> Thanx, Paul
>>> There is a fix commit ca174c705db5 ("cgroup/cpuset: Call
>>> rebuild_sched_domains() directly in hotplug") in rc4 that may help.
>>> Could you try the rc4 kernel to see whether it resolves the problem
>>> you are seeing?
>> It does, thank you!
>>
>> Tested-by: Paul E. McKenney <paulmck@xxxxxxxxxx>
> This did fix the problem, except for PREEMPT_RT kernels (which I have
> not yet bisected).  If there is another patch for that configuration,
> could you please let me know?
Thanks for the notice.  I haven't done much testing with PREEMPT_RT
kernels.  I will run some tests on a PREEMPT_RT kernel and see if there
is any problem.  Please let me know if your bisection shows the new
cpuset code to be at fault.
Cheers,
Longman
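
[Editorial note: the "crude patch" referenced in the thread is not
included in this archive, and nothing below is taken from it.  The
following is a hypothetical sketch of the two approaches the thread
describes -- flushing an entire workqueue versus a targeted
flush_work() on the specific deferred-housekeeping work item.  The
symbol names cpuset_wait_for_hotplug_work and hk_sd_work are invented
for illustration; only flush_workqueue(), flush_work(), system_wq, and
DECLARE_WORK() are real workqueue APIs.  This fragment is not meant to
build against any particular tree.]

	/*
	 * Heavy-handed variant: wait for everything on the workqueue
	 * that the handler happens to use.  Fragile, as noted above,
	 * because it silently stops helping if hk_sd_workfn() ever
	 * moves to a different workqueue.
	 */
	static void wait_for_housekeeping_crude(void)
	{
		flush_workqueue(system_wq);	/* assumes system_wq is used */
	}

	/*
	 * Targeted variant: have the cpuset code export a helper that
	 * flushes only the work item running hk_sd_workfn().  The
	 * work_struct definition here is an assumption about how the
	 * deferred call might be wired up.
	 */
	static DECLARE_WORK(hk_sd_work, hk_sd_workfn);

	void cpuset_wait_for_hotplug_work(void)
	{
		/*
		 * flush_work() returns once the last queued instance
		 * of this one work item has finished executing,
		 * without waiting on anything else in the workqueue.
		 */
		flush_work(&hk_sd_work);
	}

A CPU-hotplug callback (or the hotplug core) could then call
cpuset_wait_for_hotplug_work() before declaring the operation complete,
so that a back-to-back hotplug operation cannot race with the deferred
housekeeping_update().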