Re: [sched next] overflowed cpu time for kernel threads in /proc/PID/stat

From: Stanislaw Gruszka
Date: Tue Sep 03 2013 - 04:45:26 EST


On Mon, Sep 02, 2013 at 05:00:15PM +0300, Sergey Senozhatsky wrote:
> > > Thanks a lot Sergey for testing this further!
> > >
> > > Interesting results, so rtime is always one or two units off stime after scaling.
> > > Stanislaw made the scaling code with Linus and he has a better idea on the math guts
> > > here.
> >
> > I don't think this is a scaling issue; rather, the stime passed to
> > scale_stime() is already bigger than rtime. Sergey, could you verify
> > that by adding a check before scale_stime()?
> >
>
> usually stime < rtime.
> this is what scale_stime() gets as input:
>
> [ 1291.409566] stime:3790580815 rtime:4344293130 total:3790580815

OK, I see now: utime is 0. This looks like a problem with dynamic ticks,
since you said your workload is a kernel compilation, which should spend
a lot of CPU time in user space.

Anyway, we should handle the utime == 0 case in the scaling code. Scaling
works well when rtime and stime are not big (the variables and results fit
in 32 bits); otherwise we can end up with stime bigger than rtime. Let's
try to handle the problem with the patch below. Sergey, does it work for you?

diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index a7959e0..25cc35d 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -557,7 +557,7 @@ static void cputime_adjust(struct task_cputime *curr,
 			   struct cputime *prev,
 			   cputime_t *ut, cputime_t *st)
 {
-	cputime_t rtime, stime, utime, total;
+	cputime_t rtime, stime, utime;
 
 	if (vtime_accounting_enabled()) {
 		*ut = curr->utime;
@@ -565,9 +565,6 @@ static void cputime_adjust(struct task_cputime *curr,
 		return;
 	}
 
-	stime = curr->stime;
-	total = stime + curr->utime;
-
 	/*
 	 * Tick based cputime accounting depend on random scheduling
 	 * timeslices of a task to be interrupted or not by the timer.
@@ -588,13 +585,19 @@ static void cputime_adjust(struct task_cputime *curr,
 	if (prev->stime + prev->utime >= rtime)
 		goto out;
 
-	if (total) {
+	stime = curr->stime;
+	utime = curr->utime;
+
+	if (utime == 0) {
+		stime = rtime;
+	} else if (stime == 0) {
+		utime = rtime;
+	} else {
+		cputime_t total = stime + utime;
+
 		stime = scale_stime((__force u64)stime,
 				(__force u64)rtime, (__force u64)total);
 		utime = rtime - stime;
-	} else {
-		stime = rtime;
-		utime = 0;
 	}
 
 	/*
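
For anyone who wants to play with the case split outside the kernel, here
is a rough user-space sketch of the logic in the patch above. It is only an
illustration: cputime_t is assumed to be a plain 64-bit integer, the names
scale_exact() and adjust() are made up for this example, and scale_exact()
is a hypothetical stand-in that uses a 128-bit intermediate (GCC/Clang
unsigned __int128), not the kernel's scale_stime().

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: cputime_t modelled as a plain 64-bit counter. */
typedef uint64_t cputime_t;

/*
 * Hypothetical stand-in for scale_stime(): computes
 * stime * rtime / total exactly with a 128-bit intermediate.
 * This is NOT the kernel implementation.
 */
static cputime_t scale_exact(cputime_t stime, cputime_t rtime, cputime_t total)
{
	assert(total != 0);
	return (cputime_t)(((unsigned __int128)stime * rtime) / total);
}

/*
 * Mirrors the case split from the patch: skip scaling entirely when
 * one of the two tick-based counters is zero.
 */
static void adjust(cputime_t cur_stime, cputime_t cur_utime, cputime_t rtime,
		   cputime_t *st, cputime_t *ut)
{
	cputime_t stime = cur_stime;
	cputime_t utime = cur_utime;

	if (utime == 0) {
		/* All runtime is attributed to system time. */
		stime = rtime;
	} else if (stime == 0) {
		/* All runtime is attributed to user time. */
		utime = rtime;
	} else {
		cputime_t total = stime + utime;

		stime = scale_exact(stime, rtime, total);
		utime = rtime - stime;
	}

	*st = stime;
	*ut = utime;
}
```

With the values from Sergey's log (stime 3790580815, utime 0, rtime
4344293130), adjust() takes the first branch and returns stime equal to
rtime and utime equal to 0, so no scaling is involved at all.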