Re: PREEMPT_RT and I-PIPE: the numbers, take 3

From: Ingo Molnar
Date: Thu Jun 30 2005 - 01:00:36 EST



* Kristian Benoit <kbenoit@xxxxxxxxxxx> wrote:

> This is the 3rd run of our tests.

thanks for the testing!

> Here are the changes since last time:
>
> - Modified the IRQ latency measurement code on the logger to do a
> busy-wait on the target instead of responding to an interrupt
> triggered by the target's "reply". As Ingo had suggested, this very
> much replicates what lpptest.c does. In fact, we actually copied
> Ingo's loop.
[...]

> We stand corrected as to the method that was used to collect interrupt
> latency measurements. Ingo's suggestion to disable all interrupts on
> the logger to collect the target's response does indeed mostly
> eliminate logger-side latencies. However, we've sporadically run into
> situations where the logger locked-up, whereas it didn't before when
> we used to measure the response using another interrupt. This happened
> around 3 times in total over all of our test runs (and that's a lot of
> test runs), so it isn't systematic, but it did happen. [...]

are you timing out based on a TSC-based deadline, like lpptest does? If
an interrupt gets lost, the logger may lock up if it's looping with
interrupts disabled.
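
to illustrate the idea, here is a minimal user-space sketch of a
deadline-bounded busy-wait. It is not the lpptest.c code: it reads
clock_gettime(CLOCK_MONOTONIC) instead of the TSC, it does not disable
interrupts, and wait_for_reply(), reply_check_fn and the two stub
callbacks are hypothetical names made up for this example:

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical callback: returns true once the target's reply is seen. */
typedef bool (*reply_check_fn)(void *ctx);

/* Monotonic clock in nanoseconds (stand-in for a TSC read). */
static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/*
 * Spin until check() reports the reply or timeout_ns has elapsed.
 * Returns true and stores the elapsed time in *latency_ns on success,
 * false if the deadline passed without a reply.
 */
static bool wait_for_reply(reply_check_fn check, void *ctx,
			   uint64_t timeout_ns, uint64_t *latency_ns)
{
	uint64_t start, deadline;

	assert(check != NULL && latency_ns != NULL);
	start = now_ns();
	deadline = start + timeout_ns;

	for (;;) {
		if (check(ctx)) {
			*latency_ns = now_ns() - start;
			return true;
		}
		if (now_ns() >= deadline)
			return false;	/* reply lost: give up, don't hang */
	}
}

/* Stub: models a lost interrupt, the reply never arrives. */
static bool reply_never(void *ctx)
{
	(void)ctx;
	return false;
}

/* Stub: the reply "arrives" once the counter in *ctx reaches zero. */
static bool reply_after_countdown(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0)
		(*remaining)--;
	return *remaining == 0;
}
```

with reply_never() the loop returns false once the timeout expires
instead of spinning forever, which is the behaviour you want on the
logger when a sample gets lost.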

> +--------------------+------------+------+-------+------+--------+
> | Kernel | sys load | Aver | Max | Min | StdDev |
> +====================+============+======+=======+======+========+

> +--------------------+------------+------+-------+------+--------+
> | | None | 5.7 | 47.5 | 5.7 | 0.2 |
> | | Ping | 7.0 | 63.4 | 5.7 | 1.6 |
> | with RT-V0.7.50-35 | lm. + ping | 7.9 | 66.2 | 5.7 | 1.9 |
> | | lmbench | 7.4 | 51.8 | 5.7 | 1.4 |
> | | lm. + hd | 7.3 | 53.4 | 5.7 | 1.9 |
> | | DoHell | 7.9 | 59.1 | 5.7 | 1.8 |
> +--------------------+------------+------+-------+------+--------+

> We don't know whether we've hit the maximums Ingo alluded to, but we
> did integrate his dohell script and the only noticeable difference was
> with Linux where the maximum jumped to 525.4 micro-seconds. But that
> was with vanilla only. Neither PREEMPT_RT nor I-PIPE exhibited such
> maximums under the same load.

i'm seeing roughly half of that worst-case IRQ latency on similar
hardware (2GHz Athlon64), so i believe your system has some hardware
latency that masks the capabilities of the underlying RTOS. It would be
interesting to see IRQSOFF_TIMING + LATENCY_TRACE critical path
information from the -RT tree. Just enable those two options in the
.config (on the host side), and do:

echo 0 > /proc/sys/kernel/preempt_max_latency

and the kernel will begin measuring and tracing worst-case latency
paths. Then put some load on the host; when you see a 50+ usec latency
reported to the syslog, send me the /proc/latency_trace. It should be a
matter of a few minutes to capture this information.

also, i'm wondering why you tested with only 1,000,000 samples. I
routinely do 100,000,000 sample tests, and i did one overnight test with
more than 1 billion samples, and the latency difference is quite
significant between say 1,000,000 samples and 100,000,000 samples. All
you need to do is to increase the rate of interrupts generated by the
logger - e.g. my testbox can handle 80,000 irqs/sec with only 15% CPU
overhead.
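
to put the sample counts in terms of wall-clock time, here is a small
sketch of the arithmetic. run_seconds() is a hypothetical helper, and it
assumes a steady interrupt rate for the whole run:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Seconds needed to collect `samples` latency samples at a steady
 * rate of `irqs_per_sec`, rounded up to the next whole second.
 */
static uint64_t run_seconds(uint64_t samples, uint64_t irqs_per_sec)
{
	assert(irqs_per_sec > 0);
	return (samples + irqs_per_sec - 1) / irqs_per_sec;
}
```

at 80,000 irqs/sec, 1,000,000 samples take about 13 seconds and
100,000,000 samples take 1250 seconds, i.e. roughly 21 minutes.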

Ingo