Re: [PATCH] Re: /proc/uptime idle counter remains at 0

From: Michael Abbott
Date: Mon Jul 06 2009 - 12:11:38 EST


Looks like I dropped the ball on this. Didn't realise the patch would
just evaporate...

On Mon, 25 May 2009, Martin Schwidefsky wrote:
> On Mon, 18 May 2009 14:23:03 +0100 (BST)
> Michael Abbott <michael@xxxxxxxxxxxxxxx> wrote:
>
> > diff --git a/fs/proc/uptime.c b/fs/proc/uptime.c
> > index 0c10a0b..0f43395 100644
> > --- a/fs/proc/uptime.c
> > +++ b/fs/proc/uptime.c
> > @@ -4,13 +4,19 @@
> > #include <linux/sched.h>
> > #include <linux/seq_file.h>
> > #include <linux/time.h>
> > +#include <linux/kernel_stat.h>
> > #include <asm/cputime.h>
> >
> > static int uptime_proc_show(struct seq_file *m, void *v)
> > {
> > struct timespec uptime;
> > struct timespec idle;
> > - cputime_t idletime = cputime_add(init_task.utime, init_task.stime);
> > + int len, i;
> > + cputime_t idletime = 0;
> > +
> > + for_each_possible_cpu(i)
> > + idletime = cputime64_add(idletime, kstat_cpu(i).cpustat.idle);
> > + idletime = cputime64_to_clock_t(idletime);
> >
> > do_posix_clock_monotonic_gettime(&uptime);
> > monotonic_to_bootbased(&uptime);
>
> I found another problem with this patch: why do you convert the
> idletime from cputime_t to clock_t ? The call to cputime_to_timespec
> takes a cputime_t and the conversion from cputime to clock_t returns a
> value that is way to small on s390. After removing that line it works
> for me but I wonder why you added it.

I've tidied up the patch a trifle, the idle time processing is now very
nearly an exactly copy of that in fs/proc/stat.c.

Unfortunately this now begs more questions:

1. Why is idle time processing so different from up time, in particular
why am I still allowing uptime to overflow in a year and four months or
so?

Answer: I'm trying to just fix idle time, and the uptime calculation looks
like a can of worms to me.

2. Is this really the right calculation of idle time, in particular what
about multiple CPUs?

Answer: I don't have a multi-process test system, so I'm not qualified to
investigate. My view is that idle time should match uptime, ie should be
in literal elapsed machine time (so should really be divided by the number
of CPUs here), but it's clear that doing this properly is not
straighforward.

As for whether it is "safe" to compute idle time this way, this code is
just a copy of what is done in fs/proc/stat.c, so worry about that first!


It's rather a shame that /proc/uptime idle time reporting has been busted
now for what will be three kernel releases in a row.