Re: [patch] voluntary-preempt-2.6.8-rc3-O5

From: Ingo Molnar
Date: Wed Aug 11 2004 - 04:08:10 EST



most of the remaining latencies look quite suspect. E.g. the
select()/tty_poll() ones:

(gnome-terminal/826): 15491us non-preemptible critical section
violated 1100 us preempt threshold starting at
add_wait_queue+0x15/0x50 and ending at add_wait_queue+0x2c/0x50

[dump_stack+23/32] dump_stack+0x17/0x20
[dec_preempt_count+60/80] dec_preempt_count+0x3c/0x50
[add_wait_queue+44/80] add_wait_queue+0x2c/0x50
[normal_poll+61/375] normal_poll+0x3d/0x177
[tty_poll+97/128] tty_poll+0x61/0x80
[do_pollfd+145/160] do_pollfd+0x91/0xa0
[do_poll+95/192] do_poll+0x5f/0xc0
[sys_poll+305/544] sys_poll+0x131/0x220
[syscall_call+7/11] syscall_call+0x7/0xb

according to the trace, this latency occurred at a point where it is
nearly impossible: add_wait_queue() is just a handful of straight-line
instructions on UP.

do you have any powersaving mode enabled in the BIOS? SMM handlers can
introduce such latencies (low probability).

the only other possibilities are a measurement error or some mysterious
IRQ overhead. But almost all IRQs are redirected, so the IRQ overhead can
be eliminated almost completely. Plus, direct-IRQ overhead should also
show up via the latest preempt-timing patch. As for measurement error,
the jiffies-based printout ought to help somewhat.

i'm currently running a loop of mlockall-test 100MB on a 466 MHz
Celeron, and not a single blip on the radar with a 1000 usecs threshold,
after 1 hour of runtime ...

i've previously seen RDTSC (cycle-counter) anomalies on another box,
in userspace. To exclude this possibility i've attached yet another
patch that tries to make all the kernel rdtsc variants more robust: each
variant reads the counter twice and retries until two back-to-back reads
are less than 1000 cycles apart. Does this patch make any difference to
the latency printouts? [this patch doesn't handle cases where the rdtsc
value jumps forward in time permanently, but it handles the occasional
blips i've seen on the other box.]
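
For illustration only, the double-read filtering idea can be sketched in
portable userspace C. This is not the attached patch: read_filtered(),
struct fake_counter, fake_read() and demo_filtered_read() are hypothetical
names, and a scripted fake counter stands in for the raw rdtsc read so the
retry path can be exercised on any machine. Only the 1000-tick window is
taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical raw counter source; stands in for a plain rdtsc read. */
typedef uint64_t (*counter_fn)(void *ctx);

/*
 * Read the counter twice and accept the second value only if it is
 * ahead of the first by less than max_delta ticks. The subtraction is
 * unsigned, so a backward step wraps to a huge value and is rejected
 * as well. A permanent forward jump is not detected: the pair of reads
 * after the jump looks consistent again.
 */
static uint64_t read_filtered(counter_fn read_raw, void *ctx,
                              uint64_t max_delta)
{
	assert(read_raw != NULL && max_delta > 0);

	for (;;) {
		uint64_t first = read_raw(ctx);
		uint64_t second = read_raw(ctx);

		if (second - first < max_delta)
			return second;
	}
}

/* Fake counter that replays a scripted sequence (illustration only). */
struct fake_counter {
	const uint64_t *samples;
	size_t count;
	size_t next;
};

static uint64_t fake_read(void *ctx)
{
	struct fake_counter *fc = ctx;
	uint64_t value = fc->samples[fc->next];

	/* Keep returning the last sample once the script runs out. */
	if (fc->next + 1 < fc->count)
		fc->next++;
	return value;
}

/* The second sample is a spurious forward blip; the filter retries. */
static uint64_t demo_filtered_read(void)
{
	static const uint64_t samples[] = { 100, 5000000, 140, 180 };
	struct fake_counter fc = {
		samples, sizeof(samples) / sizeof(samples[0]), 0
	};

	return read_filtered(fake_read, &fc, 1000);
}
```

In this sketch the first pair (100, 5000000) is rejected because the two
reads are too far apart, and the second pair (140, 180) is accepted, so
the blip never reaches the caller.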

Ingo
--- linux/include/asm-i386/msr.h.orig3
+++ linux/include/asm-i386/msr.h
@@ -32,15 +32,50 @@ static inline void wrmsrl (unsigned long
wrmsr (msr, lo, hi);
}

-#define rdtsc(low,high) \
+#define __rdtsc(low,high) \
__asm__ __volatile__("rdtsc" : "=a" (low), "=d" (high))

-#define rdtscl(low) \
+#define rdtsc(low,high) do { \
+ unsigned int __low1, __high1, __low2, __high2; \
+ for (;;) { \
+ __rdtsc(__low1,__high1); \
+ __rdtsc(__low2,__high2); \
+ if (__high1 == __high2 && __low2 - __low1 < 1000) \
+ break; \
+ } \
+ low = __low2; \
+ high = __high2; \
+} while (0)
+
+#define __rdtscl(low) \
__asm__ __volatile__("rdtsc" : "=a" (low) : : "edx")

-#define rdtscll(val) \
+#define rdtscl(low) do { \
+ unsigned int __low1, __low2; \
+ for (;;) { \
+ __rdtscl(__low1); \
+ __rdtscl(__low2); \
+ if (__low2 - __low1 < 1000) \
+ break; \
+ } \
+ low = __low2; \
+} while (0)
+
+#define __rdtscll(val) \
__asm__ __volatile__("rdtsc" : "=A" (val))

+#define rdtscll(val) do { \
+ unsigned long long __val1, __val2; \
+ for (;;) { \
+ __rdtscll(__val1); \
+ __rdtscll(__val2); \
+ if (__val2 - __val1 < 1000ULL) \
+ break; \
+ } \
+ val = __val2; \
+} while (0)
+
+
#define write_tsc(val1,val2) wrmsr(0x10, val1, val2)

#define rdpmc(counter,low,high) \