Re: [RFD] Task counter: cgroup core feature or cgroup subsystem? (was Re: [PATCH 0/8 v3] cgroups: Task counter subsystem)

From: Paul Menage
Date: Fri Aug 26 2011 - 11:16:56 EST


On Wed, Aug 24, 2011 at 10:54 AM, Frederic Weisbecker
<fweisbec@xxxxxxxxx> wrote:
>
> It seems your patch doesn't handle the ->fork() and ->exit() calls.
> We probably need a quick access to states of multi-subsystems from
> the task, some lists available from task->cgroups, I don't know yet.
>

That state is available, but currently only while holding cgroup_mutex
- at least, that's what task_cgroup_from_root() requires.

We might be able to achieve the same effect by just locking the task,
so the precondition for task_cgroup_from_root() would be that either
cgroup_mutex or the task's lock is held.

We could extend the signature of cgroup_subsys.fork to include a
reference to the cgroup; for the singly-bindable subsystems this would
be trivially available via task->cgroups; for the multi-bindable
subsystems then for each hierarchy that the subsystem is mounted on
we'd call task_cgroup_from_root() to get the cgroup for that
hierarchy. So multi-bindable subsystems with fork/exit callbacks would
get called once for each mounted instance of the subsystem.

This would still make the task counter subsystem a bit painful - it
would read_lock a global rwlock (css_set_lock) on every fork/exit in
order to find the cgroup to charge/uncharge. I'm not sure how painful
that would be on a big system. If that were a noticeable performance
problem, we could have a variable-length extension on the end of
css_set that contains a list of hierarchy_index/cgroup pairs for any
hierarchies that had multi-bindable subsystems (or maybe for all
hierarchies, for simplicity). This would make creating a css_set a
little bit more complicated, but overall shouldn't be too painful, and
would make the problem of finding a cgroup for a given hierarchy
trivial.

Paul