From: Steven Rostedt
Date: Mon May 19 2008 - 15:07:21 EST
We are pleased to announce the 2.6.24.7-rt7 tree, which can be
downloaded from the location:
Information on the RT patch can be found at:
Changes since 2.6.24.7-rt6
**** HUGE PERFORMANCE IMPROVEMENT!!! ****
This is the largest performance improvement to hit the RT patch
since the removal of the global PI lock. On my 4-way box,
running "hackbench 50" went from 18 seconds down to just under
5 seconds (4.8). Vanilla 2.6.24.7 on this same box runs at 3.9 secs.
This is the first time that the RT patched kernel is less than
an order of magnitude away from mainline running this hackbench test.
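To put the quoted timings side by side, here is a small sketch of the
ratio arithmetic. The helper name speedup() is made up for this
illustration; only the three timings (18, 4.8 and 3.9 seconds) come
from the numbers above.

```c
#include <assert.h>

/*
 * Hypothetical helper: how many times faster "after" is than "before".
 * Both arguments are wall-clock times in seconds.
 */
static double speedup(double before_secs, double after_secs)
{
    assert(before_secs > 0.0 && after_secs > 0.0);
    return before_secs / after_secs;
}
```

With the figures above, speedup(18.0, 4.8) is 3.75 (rt6 to rt7), and
speedup(4.8, 3.9) is roughly 1.23, i.e. rt7 is about 23% slower than
vanilla on this test.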
Here's a run of 10 "hackbench 50" on 2.6.24.7-rt6:
[root@bxrhel51 c]# cat hack-test-2.6.24.7-rt6-00-vanilla
The following patches are the reason for this great improvement!
- lateral lock stealing (Gregory Haskins)
[root@bxrhel51 c]# cat hack-test-2.6.24.7-rt6-01-lateral-steal
This alone brought the times down by almost 60%. All this patch does
is allow an equal prio task (non-rt) to steal a lock from a pending
owner. This is very similar to the problem that was recently
discovered with generic semaphores. They forced strict fairness, but
that hurts performance. We only do this with non-rt tasks, because RT
tasks need to be fair, otherwise we risk a task being starved. And
even though it's being starved by an equal prio RT task, I would not
want to explain that to my customers when they have two high prio
tasks bound to separate CPUs and one is starving the other.
When I first wrote the code to steal lock ownership, I originally had
lateral stealing, but noticed that RT tasks were being starved by it.
Since I cared about determinism more than performance, I killed it.
But Gregory brought it back for SCHED_OTHER tasks.
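The stealing rule described above can be sketched as a small
predicate. This is a simplified illustration, not the code from the
patch: struct task, its fields, and can_steal_lock() are made-up names,
and it assumes the kernel convention that a lower prio value means a
higher priority.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified task descriptor; not the kernel's task_struct. */
struct task {
    int  prio;   /* assumed: lower value = higher priority */
    bool is_rt;  /* true for a real-time task, false for SCHED_OTHER */
};

/*
 * May "waiter" take the lock away from "pending", the pending owner
 * (the task that was handed the lock but has not yet run to claim it)?
 */
static bool can_steal_lock(const struct task *waiter,
                           const struct task *pending)
{
    assert(waiter && pending);

    /* A strictly higher priority task may always steal. */
    if (waiter->prio < pending->prio)
        return true;

    /* Lateral steal: equal priority, allowed for non-RT tasks only. */
    if (waiter->prio == pending->prio && !waiter->is_rt)
        return true;

    /* Equal prio RT tasks stay fair; lower priority never steals. */
    return false;
}
```

Two SCHED_OTHER tasks at the same prio may steal from each other, which
is where the throughput comes from, while two equal prio RT tasks still
get the lock in strict order.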
- rtmutex rearrange logic (Gregory Haskins)
This patch doesn't do much for performance by itself, but it sets up for
adaptive spinlocks, and it removes an extra xchg (but adds one, see next
patch).
- rtmutex remove double xchg (Steven Rostedt)
This patch removes a double xchg that happens on getting the rt_mutex,
as well as getting rid of the unneeded update_current.
No real performance benefits here.
[root@bxrhel51 c]# cat hack-test-2.6.24.7-rt6-02-rearrange-xchg
- adaptive spinlocks (Gregory Haskins, Sven Dietrich,
Peter Morreale, and Steven Rostedt)
I played a bit with different ways to do the adaptive spinlocks, but
found that guaranteeing that the highest prio task gets the lock is a
pain, and that I needed to go into the slow path to handle this. Well,
the guys at
Novell pretty much did that. But unfortunately, they did all sorts
of funny things (adding unneeded structures, adding stuff to
task_struct, and grabbing tasks in inappropriate places). Since I
spent quite a bit of time trying to do this, I had a good idea of
what was needed, so I rewrote their patch to what it should have
been to begin with.
Don't get me wrong, getting this to work was solely at the hands of
the Novell guys. I just had to clean it up a bit.
Here's the result:
[root@bxrhel51 c]# cat hack-test-2.6.24.7-rt6-03-adaptive-locks
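For readers new to the idea, here is a rough sketch of the decision an
adaptive lock makes for a blocked waiter. This message does not spell
out the mechanism, so the sketch assumes the usual approach: keep
spinning while the lock owner is running on another CPU (it will
probably release soon), and sleep otherwise. All names below are
hypothetical, and the priority handling in the slow path mentioned
above is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* What a waiter should do next; hypothetical names for illustration. */
enum wait_action { WAIT_TRY_ACQUIRE, WAIT_SPIN, WAIT_SLEEP };

/* Simplified snapshot of a lock as seen by a waiter. */
struct lock_state {
    bool held;           /* does some task currently own the lock? */
    bool owner_running;  /* is that owner executing on another CPU? */
};

/* One step of the adaptive wait loop. */
static enum wait_action adaptive_wait_step(const struct lock_state *s)
{
    assert(s);

    if (!s->held)
        return WAIT_TRY_ACQUIRE;  /* lock is free, go try to take it */

    if (s->owner_running)
        return WAIT_SPIN;         /* owner is making progress, spin */

    return WAIT_SLEEP;            /* owner is off CPU, spinning is wasted */
}
```

Spinning avoids two context switches when the lock is only held
briefly, which is the common case in a test like hackbench.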
To build a 2.6.24.7-rt7 tree, the following patches should be applied:
***** NOTE ******
These patches have already been ported to 2.6.25-rt. But that kernel is
still going through some needed testing.
***** NOTE *****
And like always, my RT version of Matt Mackall's ketchup will get this
for you nicely:
The broken out patches are also available.