Re: [RFC 0/5] forced comounts for cgroups.

From: Glauber Costa
Date: Thu Sep 06 2012 - 18:39:58 EST


On 09/07/2012 01:11 AM, Paul Turner wrote:
> On Thu, Sep 6, 2012 at 1:46 PM, Tejun Heo <tj@xxxxxxxxxx> wrote:
>> Hello,
>>
>> cc'ing Dhaval and Frederic. They were interested in the subject
>> before and Dhaval was pretty vocal about cpuacct having a separate
>> hierarchy (or at least granularity).
>
> Really? Time just has _not_ borne out this use-case. I'll let Dhaval
> make a case for this but he should expect violent objection.
>

I strongly advise against physical violence. In case it is really
necessary, please break his legs only.

>> On Wed, Sep 05, 2012 at 12:04:47PM +0200, Peter Zijlstra wrote:
>>>> cpuacct is rather unique tho. I think it's gonna be silly whether the
>>>> hierarchy is unified or not.
>>>>
>>>> 1. If they always can live on the exact same hierarchy, there's no
>>>> point in having the two separate. Just merge them.
>>>>
>>>> 2. If they need differing levels of granularity, they either need to
>>>> do it completely separately as they do now or have some form of
>>>> dynamic optimization if absolutely necessary.
>>>>
>>>> So, I think that choice is rather separate from other issues. If
>>>> cpuacct is gonna be kept, I'd just keep it separate and warn that it
>>>> incurs extra overhead for the current users if for nothing else.
>>>> Otherwise, kill it or merge it into cpu.
>>>
>>> Quite, hence my 'proposal' to remove cpuacct.
>>>
>>> There was some whining last time Glauber proposed this, but the one
>>> whining never convinced and has gone away from Linux, so let's just do
>>> this.
>>>
>>> Let's make cpuacct print a deprecated msg to dmesg for a few releases and
>>> make cpu do all this.
>>
>> I like it. Currently cpuacct is the only problematic one in this
>> regard (cpuset to a much lesser extent) and it would be great to make
>> it go away.
>>
>> Dhaval, Frederic, Paul, if you guys object, please voice your
>> opinions.
>>
>>> The co-mounting stuff would have been nice for cpusets as well, knowing
>>> all your tasks are affine to a subset of cpus allows for a few
>>> optimizations (smaller cpumask iterations), but I guess we'll have to do
>>> that dynamically, we'll just have to see how ugly that is.
>>
>> Forced co-mounting sounds rather silly to me. If the two are always
>> gonna be co-mounted, why not just merge them and switch the
>> functionality depending on configuration? I'm fairly sure the code
>> would be simpler that way.
>
> It would be simpler but the problem is we'd break any userspace that
> was just doing mount cpuacct?
>
> Further, even if it were mounting both, userspace code still has to be
> changed to read from "cpu.export" instead of "cpuacct.export".
>

Only if we remove cpuacct. What we can do, and I thought about doing, is
just merging cpuacct functionality into cpu. Then we move cpuacct to
default no. It will be there for userspace if they absolutely want to
use it.

> I think a sane path on this front is:
>
> Immediately:
> Don't allow cpuacct and cpu to be co-mounted on separate hierarchies
> simultaneously.
>
That is precisely what my patch does, except it is a bit more generic.

> That is:
> mount none /dev/cgroup/cpuacct -t cgroupfs -o cpuacct : still works
> mount none /dev/cgroup/cpu -t cgroupfs -o cpu : still works
> mount none /dev/cgroup/cpux -t cgroupfs -o cpuacct,cpu : still works
>
> But the combination:
> mount none /dev/cgroup/cpu -t cgroupfs -o cpu : still works
> mount none /dev/cgroup/cpuacct -t cgroupfs -o cpuacct : EINVAL [or vice versa].
>
> Also:
> WARN_ON when mounting cpuacct without cpu, strongly explaining that
> ANY such configuration is deprecated.
>
> Glauber's patchset goes most of the way towards enabling this.
>
Yes.
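
To make the "Immediately" rule concrete, here is a rough userspace model
of the check being discussed. It is only a sketch, not code from the
patchset: the function name check_mount, the PAIRED set, and the
(error, warn) return convention are all made up for illustration. It
covers just the two behaviours quoted above (EINVAL when cpu and cpuacct
would end up on separate hierarchies, and a warning when cpuacct is
mounted without cpu) and ignores every other cgroup mount rule.

```python
# Hypothetical model of the proposed rule; names are illustrative only
# and do not correspond to anything in the kernel or in the patchset.
PAIRED = frozenset({"cpu", "cpuacct"})


def check_mount(existing, requested):
    """Decide whether a new hierarchy with `requested` controllers is allowed.

    existing:  iterable of controller sets, one per mounted hierarchy.
    requested: controller names for the hierarchy being mounted.
    Returns (error, warn): error is "EINVAL" or None, warn is a bool.
    """
    new_paired = set(requested) & PAIRED
    for hierarchy in existing:
        old_paired = set(hierarchy) & PAIRED
        # One half of the pair is already mounted elsewhere and the new
        # mount carries only the other half: separate hierarchies.
        if new_paired and old_paired and new_paired.isdisjoint(old_paired):
            return ("EINVAL", False)
    # Mounting cpuacct without cpu still works, but is deprecated.
    warn = "cpuacct" in new_paired and "cpu" not in new_paired
    return (None, warn)


if __name__ == "__main__":
    print(check_mount([], {"cpuacct"}))          # allowed, with a warning
    print(check_mount([], {"cpu", "cpuacct"}))   # allowed, no warning
    print(check_mount([{"cpu"}], {"cpuacct"}))   # rejected
```

The stricter phase ("force it to be mounted ONLY with cpu") would, in
this model, amount to turning the warn case into an error.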

> In a release or two:
> Make the restriction strict; don't allow individual mounting of
> cpuacct, force it to be mounted ONLY with cpu.
>
> Glauber's patchset gives us this.
>
> Finally:
> Mirror the interfaces to cpu, print nasty syslog messages about ANY
> mounts of cpuacct
> Follow that up by eventually removing cpuacct completely
>
Why not start with mirroring? That gives people more time to start
switching to it.

