Problem with NTP on (embedded) PPC, patch and RFC

From: Giovambattista Pulcini
Date: Fri Mar 11 2005 - 08:17:57 EST


Hi,

On an embedded device based on the IBM 405GP, the NTP utility 'ntptime' always returns error code 5 (TIME_ERROR), even after the NTP daemon has reached the PLL or FLL state. This may be a general problem for most PPC platforms other than chrp and gemini. Analysis showed that the time_state variable, once set to TIME_ERROR by do_settimeofday(), is never set back to TIME_OK.
I found the problem in 2.4.10-1 (Lynuxworks BlueCat), but I also checked 2.6.11 and found a similar problem. Many platforms under arch/ppc may be affected, with the exception of chrp and gemini.

Steps to reproduce:
On a PowerPC (non-CHRP) platform, set the system date with 'date', then configure and start the NTP daemon as a client of a working NTP server. Wait for it to reach the PLL/FLL state. Run the 'ntptime' command and check that the following two errors never disappear, no matter how long you let it run: "ntp_gettime() returns code 5 (ERROR)", "ntp_adjtime() returns code 5 (ERROR)".
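
To check the reported state without ntptime, a small userspace helper can query adjtimex() directly. This is only a sketch: the helper names time_state_name() and query_time_state() are made up for illustration, and it assumes a glibc-style <sys/timex.h>. With modes set to 0 the call only reads the kernel state and changes nothing.

```c
#define _DEFAULT_SOURCE          /* expose adjtimex() under strict -std modes */
#include <stdio.h>
#include <string.h>
#include <sys/timex.h>

/* Map an adjtimex() return value to its name from <sys/timex.h>. */
static const char *time_state_name(int state)
{
    switch (state) {
    case TIME_OK:    return "TIME_OK";
    case TIME_INS:   return "TIME_INS";
    case TIME_DEL:   return "TIME_DEL";
    case TIME_OOP:   return "TIME_OOP";
    case TIME_WAIT:  return "TIME_WAIT";
    case TIME_ERROR: return "TIME_ERROR";   /* the code 5 seen in ntptime */
    default:         return "unknown";
    }
}

/* Read-only query: returns the clock state, or -1 if adjtimex() fails. */
static int query_time_state(void)
{
    struct timex tx;

    memset(&tx, 0, sizeof tx);   /* modes == 0: nothing is modified */
    return adjtimex(&tx);
}

/* Print the current state in a form similar to the ntptime messages. */
static void report_time_state(void)
{
    int state = query_time_state();

    printf("adjtimex() returns code %d (%s)\n", state, time_state_name(state));
}
```

Calling report_time_state() from main() on an affected board should keep printing code 5 for as long as the problem described here is present.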

Detailed analysis:
AFAIK NTP relies on the global time_state variable, which is statically initialized to TIME_OK (kernel/timer.c). The ntptime utility calls adjtimex(), which results in a call to do_adjtimex(), and prints its return value, which is basically the value of time_state. That variable is changed by second_overflow() (kernel/timer.c) and by the do_adjtimex() state machine (kernel/time.c).
These two functions never set time_state to TIME_OK once it has been set to TIME_ERROR.
Also, do_settimeofday() sets the STA_UNSYNC flag in time_status and sets time_state to TIME_ERROR (on ppc, but not on ppc64 or x86).
The function timer_interrupt() (arch/ppc/kernel/time.c) calls ppc_md.set_rtc_time() when certain conditions are met, as follows (time.c:171):

	if ( ppc_md.set_rtc_time && (time_status & STA_UNSYNC) == 0 &&
	     xtime.tv_sec - last_rtc_update >= 659 &&
	     abs(xtime.tv_usec - (1000000-1000000/HZ)) < 500000/HZ &&
	     jiffies - wall_jiffies == 1) {
		if (ppc_md.set_rtc_time(xtime.tv_sec+1 + time_offset) == 0)
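
To make the timing part of that condition concrete, here is a userspace sketch of the two arithmetic checks. The function names and the choice of HZ = 100 in the example are assumptions for illustration only; the STA_UNSYNC and jiffies tests from the excerpt are left out.

```c
#include <stdlib.h>

/* At least 659 seconds (about 11 minutes) since the last RTC write. */
static int rtc_update_due(long now_sec, long last_rtc_update)
{
    return now_sec - last_rtc_update >= 659;
}

/*
 * tv_usec must lie within half a tick of (1 second - 1 tick).
 * With hz = 100 the target is 990000 us and the half tick is 5000 us,
 * so the accepted range is 985001..994999 us.
 */
static int in_rtc_update_window(long tv_usec, long hz)
{
    long target = 1000000 - 1000000 / hz;
    long half_tick = 500000 / hz;

    return labs(tv_usec - target) < half_tick;
}
```

Both checks only decide when the hook may run. Neither of them touches time_state, which is why the behaviour of the hook itself matters.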

On the CHRP platform (see arch/ppc/platforms/chrp_*), the platform-specific implementation of set_rtc_time(), chrp_set_rtc_time(), has a check like this (chrp_time.c:76):

	if ( (time_state == TIME_ERROR) || (time_state == TIME_BAD) )
		time_state = TIME_OK;

which is the only chance for time_state to be set back to TIME_OK after a do_settimeofday(). On other platforms this is not done.
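
The sequence can be replayed in a toy userspace model. Nothing below is kernel code: the struct, the M_ constants and all model_* names are stand-ins invented for this sketch, the timing conditions of timer_interrupt() are omitted, and it assumes that the NTP daemon clears STA_UNSYNC through adjtimex() once it is synchronized without touching time_state.

```c
/* Local stand-ins; the values mirror TIME_OK, TIME_ERROR and STA_UNSYNC. */
enum { M_TIME_OK = 0, M_TIME_ERROR = 5 };
#define M_STA_UNSYNC 0x0040

struct ntp_model {
    int time_state;
    int time_status;
};

typedef void (*rtc_hook_t)(struct ntp_model *);

/* ppc do_settimeofday() as described above: flag unsync, force TIME_ERROR. */
static void model_settimeofday(struct ntp_model *m)
{
    m->time_status |= M_STA_UNSYNC;
    m->time_state = M_TIME_ERROR;
}

/* Assumption: the daemon reaching sync clears STA_UNSYNC only. */
static void model_ntp_sync(struct ntp_model *m)
{
    m->time_status &= ~M_STA_UNSYNC;
}

/* A platform hook that writes the RTC and leaves time_state alone. */
static void model_rtc_hook_generic(struct ntp_model *m)
{
    (void)m;
}

/* A CHRP-style hook that also resets time_state. */
static void model_rtc_hook_chrp(struct ntp_model *m)
{
    if (m->time_state == M_TIME_ERROR)
        m->time_state = M_TIME_OK;
}

/* The hook is called only while STA_UNSYNC is clear. */
static void model_timer_interrupt(struct ntp_model *m, rtc_hook_t hook)
{
    if (hook && (m->time_status & M_STA_UNSYNC) == 0)
        hook(m);
}

/* 'date', then NTP sync, then one RTC update; returns the final state. */
static int model_run(rtc_hook_t hook)
{
    struct ntp_model m = { M_TIME_OK, 0 };

    model_settimeofday(&m);
    model_ntp_sync(&m);
    model_timer_interrupt(&m, hook);
    return m.time_state;
}
```

In this model, model_run(model_rtc_hook_generic) ends in M_TIME_ERROR while model_run(model_rtc_hook_chrp) ends in M_TIME_OK, which matches the difference between the platforms described above.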


Proposed patch:
This change should make NTP work on any ppc platform without breaking chrp and gemini, although I have tested it only on my own board.
--- linux-2.6.11/arch/ppc/kernel/time.c 2005-03-02 08:38:17.000000000 +0100
+++ linux/arch/ppc/kernel/time.c 2005-03-08 14:16:56.000000000 +0100
@@ -272,7 +272,6 @@

 	time_adjust = 0;		/* stop active adjtime() */
 	time_status |= STA_UNSYNC;
-	time_state = TIME_ERROR;	/* p. 24, (a) */
 	time_maxerror = NTP_PHASE_LIMIT;
 	time_esterror = NTP_PHASE_LIMIT;
 	write_sequnlock_irqrestore(&xtime_lock, flags);


My question:
I've read some documentation, but I am by no means an expert in the kernel's NTP support. So I ask you: where should time_state be reset to TIME_OK? Should this be done by the <platform>_set_rtc_time() functions?
Or, as in the x86 case, should do_settimeofday() not set time_state to TIME_ERROR at all?


Giovambattista Pulcini


