Re: [RFC/RFT PATCH v3] sched: automated per tty task groups

From: Vivek Goyal
Date: Mon Nov 15 2010 - 20:57:21 EST

Next message: Bruno Randolf: "[PATCH v8 1/3] Add generic exponentially weighted moving average(EWMA) function"
Previous message: Paul Menage: "Re: [RFC/RFT PATCH v3] sched: automated per tty task groups"
In reply to: Peter Zijlstra: "Re: [RFC/RFT PATCH v3] sched: automated per tty task groups"
Next in thread: Linus Torvalds: "Re: [RFC/RFT PATCH v3] sched: automated per tty task groups"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Mon, Nov 15, 2010 at 02:25:50PM -0700, Mike Galbraith wrote:

[..]
>
> A recurring complaint from CFS users is that parallel kbuild has a negative
> impact on desktop interactivity. This patch implements an idea from Linus,
> to automatically create task groups. This patch only implements Linus' per
> tty task group suggestion, and only for fair class tasks, but leaves the way
> open for enhancement.
>
> Implementation: each task's signal struct contains an inherited pointer to a
> refcounted autogroup struct containing a task group pointer, the default for
> all tasks pointing to the init_task_group. When a task calls __proc_set_tty(),
> the process wide reference to the default group is dropped, a new task group is
> created, and the process is moved into the new task group. Children thereafter
> inherit this task group, and increase its refcount. On exit, a reference to the
> current task group is dropped when the last reference to each signal struct is
> dropped. The task group is destroyed when the last signal struct referencing
> it is freed. At runqueue selection time, IFF a task has no cgroup assignment,
> its current autogroup is used.

Mike,

IIUC, this automatically created task group is invisible to user space? I
mean, generally there is a task group associated with a cgroup, and user space
tools can walk through the cgroup hierarchy to figure out how the system is
configured. Will that be possible with this patch?
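
For reference, this is roughly the kind of walk I have in mind. It is only a
minimal sketch in Python, assuming a cgroup hierarchy mounted at a directory
the caller passes in, where every group directory exposes a "tasks" file
listing PIDs. The function name walk_cgroups is made up for illustration. A
task group that has no directory in the hierarchy would simply never show up
in the result.

```python
import os


def walk_cgroups(root):
    """Map each cgroup directory under `root` to the PIDs in its `tasks` file.

    `root` is assumed to be the mount point of a cgroup hierarchy; the
    function name and the returned layout are illustrative only.
    """
    groups = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        if "tasks" not in filenames:
            continue  # not a cgroup directory
        rel = os.path.relpath(dirpath, root)
        # Report the hierarchy root as "/" and children as "/a/b".
        name = "/" if rel == "." else "/" + rel.replace(os.sep, "/")
        with open(os.path.join(dirpath, "tasks")) as f:
            groups[name] = [int(line) for line in f if line.strip()]
    return groups
```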

I am wondering what will happen to things like per cgroup stats. For
example, the block controller keeps track of the number of sectors
transferred per cgroup. So will this information not be available for
these internal task groups?

Looks like everybody likes the idea but let me still ask the following
question.

Should this kind of thing be done in user space? I mean, what we are
essentially doing is providing isolation between two groups. That's why
this cgroup infrastructure is in place. It is just that currently how cgroups
are created depends entirely on user space, and the kernel does not create
cgroups of its own by default (except the root cgroup).

I think systemd does something similar, in the sense that it puts every
system service in a cgroup of its own on system startup.

The libcgroup daemon has the facility to listen for kernel events (through
a netlink socket) and then put newly created tasks in cgroups as per
the user specified rules in a config file. For example, if one wants
isolation between the tasks of two user ids, one can just write a rule, and
once the user logs in, the login session will be automatically placed
in the right cgroup. Hence one will be able to achieve isolation between
two users. I think it now also has rules for classifying executables
based on names/paths. So one can put "firefox" in one cgroup and, say,
"make -j64" in a separate cgroup and provide isolation between the two
applications. It is just a matter of putting the right rule in the config file.
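
To illustrate the idea of first-match rules, here is a small Python sketch.
It is not libcgroup's actual config format or code. The Rule type, the
classify() helper, and the cgroup paths are all made up for the example; the
only assumption is that a rule can match on a uid, an executable name, or
both, and that the first matching rule wins.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    """Hypothetical classification rule; None means "match anything"."""
    cgroup: str
    uid: Optional[int] = None
    exe: Optional[str] = None


def classify(uid, exe_path, rules, default="/"):
    """Return the cgroup of the first rule matching (uid, executable name)."""
    name = exe_path.rsplit("/", 1)[-1]  # compare on the basename only
    for rule in rules:
        if rule.uid is not None and rule.uid != uid:
            continue
        if rule.exe is not None and rule.exe != name:
            continue
        return rule.cgroup
    return default  # no rule matched: stay in the root group


# Illustrative rules: per-application groups first, then a per-user fallback.
RULES = [
    Rule(cgroup="/apps/firefox", exe="firefox"),
    Rule(cgroup="/apps/build", exe="make"),
    Rule(cgroup="/users/1000", uid=1000),
]
```

With these rules, classify(1000, "/usr/bin/make", RULES) picks the build
group, while any other program run by uid 1000 falls through to the per-user
group.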

This patch sounds like an extension of the user id problem, where we want
isolation between the processes of the same user (process groups using
different terminals). Would it make sense to generate some kind of kernel
event for this and let user space execute the rules, instead of creating
this functionality in the kernel?

This way, once we extend this functionality to other subsystems, we can
make it more flexible in user space. For example, create these groups
for the cpu controller but, let's say, not for the block controller. Otherwise
we will end up creating more kernel tunables for achieving the same effect.
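
As a rough sketch of what that flexibility could look like in user space
(Python, purely illustrative): assume a daemon receives a "task attached to
tty X" event, which does not exist today, and consults a per-controller
policy table. The function name, the POLICY table, and the "/tty" base path
are hypothetical.

```python
def placements_for_tty(tty_name, enabled, base="/tty"):
    """Return (controller, cgroup path) pairs for a tty, one per enabled controller.

    `enabled` maps controller name -> bool; controllers set to False are
    left alone, so their tasks stay wherever they already are.
    """
    # "pts/3" becomes "pts_3" so each tty maps to a single directory name.
    group = f"{base}/{tty_name.replace('/', '_')}"
    return [(ctrl, group) for ctrl, on in sorted(enabled.items()) if on]


# Illustrative policy: group per tty for cpu, but not for block I/O.
POLICY = {"cpu": True, "blkio": False}
```

Changing the behaviour for a controller is then a one-line policy change in
user space rather than a new kernel tunable.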

Thanks
Vivek
