Re: [PATCH] Re: boot time, process start time, and NOW time

From: john stultz
Date: Tue Aug 17 2004 - 17:39:59 EST


On Tue, 2004-08-17 at 15:24, George Anzinger wrote:
> Tim Schmielau wrote:
> > On Mon, 16 Aug 2004, Andrew Morton wrote:
> >
> >
> >>OGAWA Hirofumi <hirofumi@xxxxxxxxxxxxxxxxxx> wrote:
> >>
> >>>Albert Cahalan <albert@xxxxxxxxxxxx> writes:
> >>>
> >>>
> >>>>Even with the 2.6.7 kernel, I'm still getting reports of process
> >>>>start times wandering. Here is an example:
> >>>>
> >>>> "About 12 hours since reboot to 2.6.7 there was already a
> >>>> difference of about 7 seconds between the real start time
> >>>> and the start time reported by ps. Now, 24 hours since reboot
> >>>> the difference is 10 seconds."
> >>>>
> >>>>The calculation used is:
> >>>>
> >>>> now - uptime + time_from_boot_to_process_start
> >>>
> >>>Start-time and uptime is using different source. Looks like the
> >>>jiffies was added bogus lost counts.
> >>>
> >>>quick hack. Does this change the behavior?
> >>
> >>Where did this all end up? Complaints about wandering start times are
> >>persistent, and it'd be nice to get some fix in place...
> >>
> >>Thanks.
> >>
> >
> >
> > Seems my analysis of the problem wasn't perceived as such.
> >
> > The problem is that in the above calculation
> >
> > now - uptime + time_from_boot_to_process_start
> >
> > "uptime" currently is an ntp-corrected precise time, while
> > "time_from_boot_to_process_start" just is the free-running "jiffies"
> > value.
>
> I see you think you have the solution, but I guess I am just dense here. May be
> you could help me to see the error of my ways. Here is my thinking:
>
> "now" is from gettimeofday() and as such is ntp corrected.
> "uptime" is also corrected. In fact it is "now" + "wall_to_monotonic". And
> "wall_to_monotonic" is _only_ changed by do_settime() when the clock is set.
> "time_from_boot_to_process_start" is the same as "start_time" restated in
> seconds, i.e. it is a constant. So, either one or more of the above assumtions
> is wrong, or somebody is twiddling the clock. Otherwise I don't see how the
> start time can move at all.

The problem is start time is derived from task->start_time which is the
jiffies value at the time the process started. Thus interval calculated
by: (start_time = p->start_time - INITIAL_JIFFIES) or (run_time =
get_jiffies_64() - p->start_time) is not NTP adjusted.

So both (uptime - run_time) or (boot_time + start_time) will have
problems.

What needs to happen is task->start_time is changed to a timespec which
is set at fork time to be do_posix_clock_monotonic_gettime(). Then in
proc_pid_stat() we can calculate the appropriate user-jiffies value.


task->start_time is used at the following lines:

include/linux/sched.h: 460
kernel/fork.c: 964
fs/proc/array.h: 359
kernel/acct.c: 404
mm/oom_kill.c: 64

I'm stuck trying to fix the last two files at the moment. Please let me
know if you see any other uses.

thanks
-john


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/