Re: [PATCH v11 7/9] cpuset: Expose cpus.effective and mems.effective on cgroup v2 root

From: Waiman Long
Date: Mon Jul 02 2018 - 20:41:45 EST

On 07/03/2018 12:53 AM, Tejun Heo wrote:
> Hello, Waiman.
> On Sun, Jun 24, 2018 at 03:30:38PM +0800, Waiman Long wrote:
>> Because setting "cpuset.sched.partition" in a direct child of root
>> can remove CPUs from the root's effective CPU list, it makes sense to
>> know what CPUs are left in the root cgroup for scheduling purposes. So
>> the "cpuset.cpus.effective" control file is now exposed in the v2
>> cgroup root.
> So, effective changing when enabling partition on a child feels wrong
> to me. It's supposed to contain what's actually allowed to the cgroup
> from its parent and that shouldn't change regardless of how those
> resources are used. It's still given to the cgroup from its parent.

Another way to work around this issue is to expose the reserved_cpus in
the parent for holding CPUs that can be taken by a child partition. That
will require adding one more cpuset file for those cgroups that are
partition roots.
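
To make the two options concrete, here is a toy userspace model in
Python. It is only a sketch, not kernel code: parse_cpulist(),
format_cpulist() and split_parent() are hypothetical helper names, and
the 8-CPU root with a child partition on CPUs 2-3 is an assumed example.
It shows the set of CPUs the root can still schedule on (what the patch
reports via cpus.effective) next to the set that would go into a
separate reserved-CPUs file.

```python
# Toy model only; not kernel code. All names here are hypothetical.

def parse_cpulist(text):
    """Parse a kernel-style CPU list such as "0-3,8" into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def format_cpulist(cpus):
    """Format a set of ints back into a compact CPU list string."""
    ranges = []
    for cpu in sorted(cpus):
        if ranges and cpu == ranges[-1][1] + 1:
            ranges[-1][1] = cpu          # extend the current range
        else:
            ranges.append([cpu, cpu])    # start a new range
    return ",".join(str(a) if a == b else f"{a}-{b}" for a, b in ranges)


def split_parent(parent_allowed, child_partitions):
    """Return (left_for_parent, reserved) for a parent cpuset.

    parent_allowed:   set of CPUs given to the parent.
    child_partitions: list of CPU sets claimed by child partition roots.
    """
    reserved = set()
    for cpus in child_partitions:
        # A child can only take CPUs that the parent actually has.
        reserved |= cpus & parent_allowed
    return parent_allowed - reserved, reserved


# Assumed example: root owns CPUs 0-7, one child partition takes 2-3.
root_allowed = parse_cpulist("0-7")
left, reserved = split_parent(root_allowed, [parse_cpulist("2-3")])
print("left for root:", format_cpulist(left))      # 0-1,4-7
print("reserved:     ", format_cpulist(reserved))  # 2-3
```

With a separate reserved file, the root's allowed set ("0-7" in this
example) would stay unchanged and only the reserved list would vary as
child partitions come and go.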

> It's a bit different because the way partition behaves is different
> from other resource knobs in that it locks away those cpus so that
> they can't be taken back.
> What do people think about restricting partition to the first level
> children for now at least? That way we aren't locked into the special
> semantics and we can figure out how to do this down the hierarchy later.
> Given that we ignore the regular cpuset settings when the set goes
> empty (which also is a special condition which only exists for cpuset)
> and inherits the parent's, I think the consistent thing to do is doing
> the same for partition - if it can't be satisfied, ignore it, but
> maybe there is a better way.

I don't mind restricting that to the first level children for now. That
does restrict where we can put the container root if we want a separate
partition for a container. Let's hear if others have any objection about