Re: Clue on 2.0.33 crashes

Ulrich Windl (ulrich.windl@rz.uni-regensburg.de)
Tue, 24 Feb 1998 08:03:52 +0100


On 23 Feb 98 at 10:02, Thomas Schenk wrote:

> > I think I have answered such a question before. For i386 architecture
> > no problem arises. Really! For other architectures I don't know,

Maybe I should add: without SMP. My 486@33MHz here has been running
2.0.32 with xntpd3-5.91 for 77 days and 23 hours now, and it has never
crashed. I also run Pentiums from time to time, and they do not crash
either. I suspect another SMP interaction (xntpd uses SCHED_FIFO, for
example).

> > because the state as of 2.0.33 seems inconsistent. With my recent
> > PPSkits I tried to make the implementation consistent again, so you
> > could add one of them. I have never tried that, but still I don't
> > believe that the system crashes because of that.
>
> It is not my contention that xntpd is causing the crash directly, but that
> the unstable time state (paraphrasing your own words) that results when
> xntpd adjusts the time (as opposed to stepping the time like ntpdate does)
> eventually leads to a crash on our SMP systems. I just know that by not
> running xntpd and using ntpdate -s -b hostname, that we have experienced
> far fewer crashes than before.

If you look at the patch, the only changes are:

+ Renamed some constants
+ Changed the value for NTP_PHASE_LIMIT
+ Set stime_state to TIME_ERROR where required
+ Avoid adjtime after a settimeofday (by setting time_adjust to
  zero)
+ Added a message that might indicate a kernel bug (never appeared
  yet)
+ Corrected the logic for RTC updates (previously your CMOS clock
  might never be updated, depending on tick)
+ update_wall_time_one_tick should be essentially the same
+ adjtimex has been rewritten to do proper checking of parameters
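
To illustrate the time_adjust item in the list above, here is a toy
model in Python. It is only a sketch and not the kernel code: the
ToyClock class, the helper error_after_step, and the constants
TICK_USEC and TICKADJ_USEC are assumptions made up for this example.
It shows what can happen when a slew requested through adjtime is
still pending at the moment the clock is stepped.

```python
# Toy model only. time_adjust mirrors the kernel variable named in the
# list above; the class, the helper and both constants are illustrative
# assumptions, not values taken from the patch.

TICK_USEC = 10_000   # assumed tick length: 10 ms
TICKADJ_USEC = 5     # assumed maximum slew applied per tick


class ToyClock:
    def __init__(self, now_usec=0):
        self.now_usec = now_usec
        self.time_adjust = 0  # pending adjtime-style slew, in microseconds

    def adjtime(self, delta_usec):
        """Request a gradual correction; the clock does not jump."""
        self.time_adjust = delta_usec

    def settimeofday(self, new_usec, clear_adjust=True):
        """Step the clock. With clear_adjust, drop any slew still pending."""
        self.now_usec = new_usec
        if clear_adjust:
            self.time_adjust = 0

    def tick(self):
        """Advance one tick, applying at most TICKADJ_USEC of pending slew."""
        step = max(-TICKADJ_USEC, min(TICKADJ_USEC, self.time_adjust))
        self.time_adjust -= step
        self.now_usec += TICK_USEC + step


def error_after_step(clear_adjust, ticks=100):
    """Clock error in usec after a slew request followed by a step."""
    clock = ToyClock(now_usec=-1_000)    # clock starts 1 ms behind true time 0
    clock.adjtime(1_000)                 # a slew is requested to make up 1 ms
    clock.settimeofday(0, clear_adjust)  # then the clock is stepped to true time
    for _ in range(ticks):
        clock.tick()
    return clock.now_usec - ticks * TICK_USEC  # true time advanced by the ticks
```

In this model, error_after_step(True) returns 0, while
error_after_step(False) returns 500 after 100 ticks: the stale slew
keeps being applied on top of a clock that was already correct.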

>
> >
> > BTW: I have posted an article in comp.protocols.time.ntp about
> > "kernel pll status change 89"...
>
> I read this the first time I received this syslog message. It was very
> informative, but did not really allay my suspicions about xntpd's
> interaction with the new kernel pll code.

If you really care, try to undo my changes and retry. If there is
really something I can't see, we could try to fix it.

>
> I hope that you did not take this as a personal attack as it was not meant
> that way at all. I was merely offering a data point on a phenomenon that
> we observed in the hope that it might help to resolve the problem.

If only you had a more precise suspect. I have no idea where to
search. Do you have a kernel-crash-program that really works?

>
> BTW, where can I find out about the PPSkits you mentioned?

So if the search engines and web crawlers failed, have a look at
ftp://pcphy4.physik.uni-regensburg.de/pub/wiu09524/PPS/PPSkit-*tar.gz

(0.3.5 is current)

(Don't use an HTTP connection unless you are interested in algae ;-)

Ulrich
