Re: [patch 3/4] taskstats: Introduce cdata_acct for completecumulative accounting

From: Oleg Nesterov
Date: Tue Nov 23 2010 - 12:06:49 EST


On 11/19, Michael Holzheu wrote:
>
> Currently the cumulative time accounting in Linux is not complete.
> Due to POSIX POSIX.1-2001, the CPU time of processes is not accounted
> to the cumulative time of the parents, if the parents ignore SIGCHLD
> or have set SA_NOCLDWAIT. This behaviour has the major drawback that
> it is not possible to calculate all consumed CPU time of a system by
> looking at the current tasks. CPU time can be lost.
>
> This patch adds a new set of cumulative time counters. We then have two
> cumulative counter sets:
>
> * cdata_wait: Traditional cumulative time used e.g. by getrusage.
> * cdata_acct: Cumulative time that also includes dead processes with
> parents that ignore SIGCHLD or have set SA_NOCLDWAIT.
> cdata_acct will be exported by taskstats.

Looks correct at first glance. A couple of nits below.

> TODO:
> -----
> With this patch we take the siglock twice. First for the dead task
> and second for the parent of the dead task. This give the following
> lockdep warning (probably a lockdep annotation is needed here):

And we already discussed this ;) We do not need 2 siglock's, only
parent's. Just move the callsite in __exit_signal() down, under
another (lockless) group_dead check.

Or I missed something?

> @@ -595,6 +595,8 @@ struct signal_struct {
> */
> struct cdata cdata_wait;
> struct cdata cdata_threads;
> + struct cdata cdata_acct;
> + struct task_io_accounting ioac_acct;
> struct task_io_accounting ioac;

Given that task_io_accounting is Linux specific, perhaps we can use
signal->ioac in both cases?

Yes, this is a user-visible change anyway. But, at least we can
forget about POSIX.

> + spin_lock_irqsave(&p->real_parent->sighand->siglock, flags);
> + if (wait) {
> + pcd = &p->real_parent->signal->cdata_wait;
> + tcd = &p->signal->cdata_threads;
> + cd = &p->signal->cdata_wait;
> + } else {
> + pcd = &p->real_parent->signal->cdata_acct;
> + tcd = &p->signal->cdata_threads;
> + cd = &p->signal->cdata_acct;
> + }

We can do this before taking ->siglock. Not that I think this really
matters, but otherwise this looks a bit confusing imho, as if we need
parent's ->siglock to pin something.


And thanks for splitting these changes. It was much, much easier to
read now.

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/