Re: [patch] Real-Time Preemption, -RT-2.6.10-rc3-mm1-V0.7.33-0

From: Mark_H_Johnson
Date: Tue Dec 14 2004 - 18:23:29 EST


A comparison of PREEMPT_RT (with tracing) to PREEMPT_DESKTOP
(with tracing). Due to the tracing overhead (and the results
I measured over the last few days), I'll report the percentage
of samples within 200 usec for RT and within 100 usec for PK,
to make the comparison "more fair".
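
To make that concrete: the "within N usec" figures below are simply
the fraction of measured latencies at or under the threshold. A
minimal sketch of the bookkeeping (the sample data is made up; only
the 200/100 usec thresholds come from this report):

/* Count what fraction of measured latencies fall at or under a
 * threshold (200 usec for RT, 100 usec for PK).  Sample data is
 * illustrative only. */
#include <stdio.h>

static double percent_within(const long *lat_us, int n, long thresh_us)
{
	int i, ok = 0;

	for (i = 0; i < n; i++)
		if (lat_us[i] <= thresh_us)
			ok++;
	return 100.0 * ok / n;
}

int main(void)
{
	long samples[] = { 150, 180, 90, 250, 120 };	/* usec, made up */
	int n = sizeof(samples) / sizeof(samples[0]);

	printf("within 200 usec: %.2f%%\n",
	       percent_within(samples, n, 200));
	return 0;
}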

Comparison of .33-00RT and .33-00PK results:
- 00RT has PREEMPT_RT
- 00PK has PREEMPT_DESKTOP and no threaded IRQs
- 2.4 has the lowlat + preempt patches applied

          within 200/100 usec
          CPU loop (%)  Elapsed Time (sec)         2.4
Test         RT      PK       RT      PK    |    CPU Elapsed
X         96.78   99.88       90 *    83+*  |  97.20    70
top       93.87  100.00       36 *    31+   |  97.48    29
neto      99.17  100.00      340 *   193+   |  96.23    36
neti      99.13   98.39      340 *   280 *  |  95.86    41
diskw     98.04  100.00?     350 *   310+   |  77.64    29
diskc     95.56   99.94      350 *   310+   |  84.12    77
diskr     90.77   99.94      220 *   310+   |  90.66    86
total                       1726    1517    |          368
         [higher is better]  [lower is better]
* wide variation in audio duration
+ long stretch of audio duration "too fast"
? I believe I had non-RT starvation and this result is in error
[the chart shows typical results for about 20 seconds and then
gets "really smooth" like the top chart for the remainder of
the measured duration]

The percentages are all within a few percent of each other, so
these results look quite comparable.

Looking at ping response time:
RT 0.226 / 0.486 / 2.475 / 0.083 msec
PK 0.102 / 0.174 / 0.813 / 0.054 msec
for the min / average / max / mdev values. Again, tracing penalizes
RT much more than PK, so this is to be expected.
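
For reference, those four values can be recovered from the raw RTT
samples roughly as in the sketch below. The samples are made up, and
the mdev formula (root mean square deviation from the average) is my
reading of how iputils ping computes it, so treat that as an
assumption. Build with -lm.

/* Summarize RTT samples into a ping-style min/avg/max/mdev line.
 * Sample values are made up; the mdev formula is assumed to match
 * iputils (rms deviation from the mean). */
#include <math.h>
#include <stdio.h>

int main(void)
{
	double rtt[] = { 0.226, 0.486, 2.475, 0.310 };	/* msec, made up */
	int i, n = sizeof(rtt) / sizeof(rtt[0]);
	double min = rtt[0], max = rtt[0], sum = 0.0, sum2 = 0.0;
	double avg, mdev;

	for (i = 0; i < n; i++) {
		if (rtt[i] < min)
			min = rtt[i];
		if (rtt[i] > max)
			max = rtt[i];
		sum += rtt[i];
		sum2 += rtt[i] * rtt[i];
	}
	avg = sum / n;
	mdev = sqrt(sum2 / n - avg * avg);

	printf("rtt min/avg/max/mdev = %.3f/%.3f/%.3f/%.3f ms\n",
	       min, avg, max, mdev);
	return 0;
}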

The maximum duration of the CPU loop (as measured by the
application) is in the range of 2.05 to 3.30 msec for -00RT,
compared to the nominal 1.16 msec duration. The equivalent
numbers for -00PK are 1.21 to 2.61 msec. I would expect RT
to do better than PK on this measure, but that is never the
result I measure.
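
The measurement itself is straightforward: time each pass of a fixed
CPU loop and keep the worst case. A minimal sketch of that pattern
(burn_cpu() and its iteration count are stand-ins for the real
~1.16 msec workload, not the actual test code; build with -lrt on
older glibc):

/* Time each pass of a fixed CPU loop with CLOCK_MONOTONIC and track
 * the maximum observed duration.  burn_cpu() is a placeholder for
 * the real workload. */
#include <stdio.h>
#include <time.h>

static void burn_cpu(void)
{
	volatile long i;

	for (i = 0; i < 1000000; i++)
		;	/* placeholder for ~1.16 msec of real work */
}

static long elapsed_us(const struct timespec *a, const struct timespec *b)
{
	return (b->tv_sec - a->tv_sec) * 1000000L +
	       (b->tv_nsec - a->tv_nsec) / 1000L;
}

int main(void)
{
	struct timespec t0, t1;
	long us, max_us = 0;
	int pass;

	for (pass = 0; pass < 1000; pass++) {
		clock_gettime(CLOCK_MONOTONIC, &t0);
		burn_cpu();
		clock_gettime(CLOCK_MONOTONIC, &t1);
		us = elapsed_us(&t0, &t1);
		if (us > max_us)
			max_us = us;
	}
	printf("max loop duration: %ld usec\n", max_us);
	return 0;
}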

Based on the number of latency_trace files, -00RT is still
far better than -00PK. In particular, I get some extended
delays in -00PK from:
- network (I have a 2000 usec example!)
- rcu_process_callbacks (around 250 usec)
- clear_page_range (around 170 usec)
- free_pages_and_swap_cache (around 140 usec)
- do_no_page (around 170 usec)
- ide [IRQ?] (around 200 usec)
- journal_remove_journal_head (> 1000 usec)
- do_wait / wait_task_zombie (around 200 usec)
Fixes for the network and journaling latencies would be helpful;
the others are certainly less important. I'll send the traces
separately.

Also, if you get some odd trace results on an SMP system, note that
Ingo has already applied some fixes in response to buglets I found
and reported separately.

--Mark H Johnson
<mailto:Mark_H_Johnson@xxxxxxxxxxxx>
