Re: [PATCH RFC] cpuset: Make cpusets get restored on hotplug

From: Waiman Long
Date: Tue Oct 26 2021 - 22:35:51 EST


On 10/26/21 10:21 PM, Barry Song wrote:
On Wed, Oct 27, 2021 at 2:06 PM Waiman Long <longman@xxxxxxxxxx> wrote:

On 10/26/21 7:58 PM, Barry Song wrote:
I think Tejun is concerned about a change in the default behavior of
cpuset v1.

There is a special v2 mode for cpuset that is enabled by the mount
option "cpuset_v2_mode". This causes cpuset v1 to adopt some of the
v2 behavior. I introduced this v2 mode a while back to address, I think,
a similar concern. Could you try that to see if it is able to address
your problem? If not, you can make some code adjustments within the
framework of the v2 mode. As long as it is an opt-in, I think we are
open to further change.
I am also able to reproduce on Ubuntu 21.04 LTS.

All docker containers are placed in this cgroup and its child cgroups:
/sys/fs/cgroup/cpuset/docker

disabling and enabling SMT by:
echo off > /sys/devices/system/cpu/smt/control
echo on > /sys/devices/system/cpu/smt/control

or unplugging and plugging CPUs by:
echo 0 > /sys/devices/system/cpu/cpuX/online
echo 1 > /sys/devices/system/cpu/cpuX/online

then all docker containers will lose some CPUs.

So should we document the broken behaviours somewhere?
Is the special cpuset_v2_mode mount option able to fix the issue?

This mode is documented in

Documentation/admin-guide/cgroup-v1/cpuset.rst:

The cpuset.effective_cpus and cpuset.effective_mems files are
normally read-only copies of cpuset.cpus and cpuset.mems files
respectively. If the cpuset cgroup filesystem is mounted with the
special "cpuset_v2_mode" option, the behavior of these files will become
similar to the corresponding files in cpuset v2. In other words, hotplug
events will not change cpuset.cpus and cpuset.mems. Those events will
only affect cpuset.effective_cpus and cpuset.effective_mems which show
the actual cpus and memory nodes that are currently used by this cpuset.
See Documentation/admin-guide/cgroup-v2.rst for more information about
cpuset v2 behavior.
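
To check whether a running system already uses this mode, one could look
for the option in the mount table. The following is a minimal sketch,
assuming the option shows up as "cpuset_v2_mode" in the options field of
/proc/mounts for the cgroup v1 cpuset hierarchy. The helper name
has_cpuset_v2_mode is hypothetical and not part of any existing tool.

```shell
#!/bin/sh
# Sketch: succeed (exit 0) if a cgroup v1 hierarchy carrying the cpuset
# controller is mounted with the cpuset_v2_mode option.
# has_cpuset_v2_mode is a hypothetical helper name (assumption).
# The mounts file defaults to /proc/mounts; pass another path to test.
has_cpuset_v2_mode() {
    mounts_file="${1:-/proc/mounts}"
    # Fields: device mountpoint fstype options dump pass
    awk '$3 == "cgroup" {
             n = split($4, opts, ",")
             cpuset = 0; v2 = 0
             for (i = 1; i <= n; i++) {
                 if (opts[i] == "cpuset") cpuset = 1
                 if (opts[i] == "cpuset_v2_mode") v2 = 1
             }
             if (cpuset && v2) found = 1
         }
         END { exit(found ? 0 : 1) }' "$mounts_file"
}
```

Calling has_cpuset_v2_mode with no argument inspects /proc/mounts; a
non-zero exit status suggests the hierarchy still has the default v1
hotplug behavior.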

Maybe we can make it more visible.
Is it possible to enable cpuset_v2_mode by default? I am not quite sure
whether it would break anything.

The cpuset_v2_mode option changes v1 behavior, which is why it is opt-in:
we don't want to break existing applications that depend on the current
v1 behavior. Users who switch to cgroup v2 get the new behavior.
Alternatively, they can modify the system startup script to mount the
cpuset hierarchy with this option and get the v2 behavior. I don't
think we are going to change the v1 default behavior.
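
As a rough illustration of the startup-script route, the sketch below
only prints a mount command rather than running it. It assumes the
cpuset hierarchy is not already mounted by the init system (a hierarchy
that is already mounted may not pick up the option), and it uses
/sys/fs/cgroup/cpuset as the default mount point, as in the report
above. The helper name cpuset_v2_mount_cmd is hypothetical.

```shell
#!/bin/sh
# Sketch: print (do not run) a mount command that enables cpuset_v2_mode
# on a cgroup v1 cpuset hierarchy.
# cpuset_v2_mount_cmd is a hypothetical helper name (assumption).
# The mount point defaults to /sys/fs/cgroup/cpuset.
cpuset_v2_mount_cmd() {
    mountpoint="${1:-/sys/fs/cgroup/cpuset}"
    printf 'mount -t cgroup -o cpuset,cpuset_v2_mode cpuset %s\n' "$mountpoint"
}
```

Running the printed command requires root and should be tried on a test
system first, since existing tools may depend on the default v1
behavior.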

Cheers,
Longman