Re: [RFC PATCH] cpuacct: per-cgroup utime/stime statistics - v1

From: Balbir Singh
Date: Wed Mar 11 2009 - 12:44:38 EST


* Balbir Singh <balbir@xxxxxxxxxxxxxxxxxx> [2009-03-11 20:46:57]:

> * Bharata B Rao <bharata@xxxxxxxxxxxxxxxxxx> [2009-03-10 18:12:08]:
>
> > Hi,
> >
> > Based on the comments received during my last post
> > (http://lkml.org/lkml/2009/2/25/129), here is a fresh attempt
> > to get per-cgroup utime/stime statistics as part of cpuacct controller.
> >
> > This patch adds a new file cpuacct.stat which displays two stats:
> > utime and stime. I wasn't too sure about the usefulness of providing
> > per-cgroup guest and steal times and hence not including them here.
> >
> > Note that I am using percpu_counter for collecting these two stats.
> > Since percpu_counter subsystem doesn't protect the readside, readers could
> > theoritically obtain incorrect values for these stats on 32bit systems.
> > I hope occasional wrong values is not too much of a concern for
> > statistics like this. If it is a problem, we have to either fix
> > percpu_counter or do it all by ourselves as Kamezawa attempted
> > for cpuacct.usage (http://lkml.org/lkml/2009/3/4/14)
> >
> > Regards,
> > Bharata.
> >
> > cpuacct: Add stime and utime statistics
> >
> > Add per-cgroup cpuacct controller statistics like the system and user
> > time consumed by the group of tasks.
> >
> > Signed-off-by: Bharata B Rao <bharata@xxxxxxxxxxxxxxxxxx>
> > Signed-off-by: Balaji Rao <balajirrao@xxxxxxxxx>
> > ---
> > Documentation/cgroups/cpuacct.txt | 8 +++
> > kernel/sched.c | 87 +++++++++++++++++++++++++++++++++++---
> > 2 files changed, 89 insertions(+), 6 deletions(-)
> >
> > --- a/Documentation/cgroups/cpuacct.txt
> > +++ b/Documentation/cgroups/cpuacct.txt
> > @@ -30,3 +30,11 @@ The above steps create a new group g1 an
> > process (bash) into it. CPU time consumed by this bash and its children
> > can be obtained from g1/cpuacct.usage and the same is accumulated in
> > /cgroups/cpuacct.usage also.
> > +
> > +cpuacct.stat file lists a few statistics which further divide the
> > +CPU time obtained by the cgroup into user and system times. Currently
> > +the following statistics are supported:
> > +
> > +utime: Time in milliseconds spent by tasks of the cgroup in user mode.
> > +stime: Time in milliseconds spent by tasks of the cgroup in kernel mode.
> > +
>
> Hi, Bharata,
>
> I did a quick run of the patch on my machine. The patch applied and
> compile cleanly, here are a few comments?
>
> 1. We could consider enhancing the patch to account for irq, softirq,
> etc time like cpustat does. Not right away, but iteratively
> 2. The accounting is converted to milliseconds, I would much rather
> export it in cputime to be consistent with other cpu accounting.
> I remember we used to return nanosecond accurate accounting and then
> moved to cputime based accounting for cpuacct.
> 3. How do we deal with CPU hotplug. Since we use a per-cpu counter,
> any hotplug would mean that the data related to the offlined CPU is
> lost. That is how the current CPU accounting system seems to work.
>

Bharata,

I tried the diff below and saw what I was looking for

diff --git a/kernel/sched.c b/kernel/sched.c
index 015155d..fadd17f 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -4195,7 +4195,7 @@ void account_user_time(struct task_struct *p, cputime_t cputime,
else
cpustat->user = cputime64_add(cpustat->user, tmp);

- cpuacct_update_stats(p, CPUACCT_STAT_UTIME, cputime_to_msecs(cputime));
+ cpuacct_update_stats(p, CPUACCT_STAT_UTIME, cputime);
/* Account for user time used */
acct_update_integrals(p);
}
@@ -4257,7 +4257,7 @@ void account_system_time(struct task_struct *p, int hardirq_offset,
else
cpustat->system = cputime64_add(cpustat->system, tmp);

- cpuacct_update_stats(p, CPUACCT_STAT_STIME, cputime_to_msecs(cputime));
+ cpuacct_update_stats(p, CPUACCT_STAT_STIME, cputime);

/* Account for system time used */
acct_update_integrals(p);
@@ -9611,6 +9611,7 @@ static int cpuacct_stats_show(struct cgroup *cgrp, struct cftype *cft,
for (i = 0; i < CPUACCT_STAT_NSTATS; i++) {
s64 val = percpu_counter_read(&ca->cpustat[i]);
val *= cpuacct_stat_desc[i].unit;
+ val = cputime_to_clock_t(val);
cb->fill(cb, cpuacct_stat_desc[i].msg, val);
}
return 0;

1. The data is returned in clock_t
2. CPU hotplug seems to be already correctly handled by per_cpu
counters


--
Balbir
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/