The FPU memcpy patch Web page
(http://www.tiac.net/users/rlk/linux.html) states, "It will do no good
on a crippled Pentium with a 32-bit memory bus (if your motherboard
can accept a single 72-pin SIMM)." It doesn't say, "May further
reduce performance on a crippled Pentium", so I applied the patch to
1.99.12 and above. The patch applied cleanly, and there were no
apparent immediate adverse effects.
Today, I attempted to measure the performance of the patch on
my system, using "x11perf -putimage500" as a tool. Actually, I was
just trying to see how my system compared to one mentioned on another
thread in this newsgroup, but I ended up investigating the patch.
For kernels 1.99.12 and 2.0.0, I measured 18-19 fps without the
patch. Performance dropped to 14 fps with the patch. :-(
1) I hope others will run similar tests, but I suggest that
results should be sent solely to Robert Krawitz
<rlk@tiac.net> to avoid bogging down the linux-kernel list.
2) The description of the patch should include words
to the effect that it may decrease performance on
some systems. This warning should be included in future
versions of the patch itself, as well as appear on the Web page.
3) If my problem is due to a factor such as my machine's
bus width, perhaps the kernel (or the configuration
process) could automatically choose the best memcpy
for a particular system based on a system startup
timing test?
   a) Perhaps "make config" could run a small program to determine
      the better algorithm on a particular system (after asking
      whether to do so)?
4) Alternatively, perhaps the poor performance is an artifact of
the particular program I used as a test; perhaps it does a lot
of short memcpy calls, for which the patch has (I speculate)
greater setup time.
   a) Maybe the patch's __generic_memcpy_fromfs and
      __generic_memcpy_tofs calls should be inlined (with
      non-inlined calls to __xcopy_*)?
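
The startup timing test suggested in 3) could be sketched roughly as
follows, in user space rather than in the kernel. Everything here is
hypothetical: copy_libc and copy_bytewise are stand-ins for the stock
and patched routines (neither is the actual patch code), and time_copy
and pick_faster are names invented for illustration.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Common signature for the copy routines being compared. */
typedef void *(*copy_fn)(void *dst, const void *src, size_t n);

/* Stand-in for the stock routine: just the C library memcpy. */
static void *copy_libc(void *dst, const void *src, size_t n)
{
    return memcpy(dst, src, n);
}

/* Stand-in for an alternative routine: a plain byte-at-a-time loop.
 * This is NOT the FPU patch; it only gives the test something to compare. */
static void *copy_bytewise(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    while (n-- > 0)
        *d++ = *s++;
    return dst;
}

/* Keeps the compiler from discarding the copies as dead stores. */
static volatile unsigned char sink;

/* Time `reps` copies of `size` bytes (size must be nonzero).
 * Returns CPU seconds, or -1.0 if the buffers cannot be allocated. */
static double time_copy(copy_fn fn, size_t size, long reps)
{
    unsigned char *src = malloc(size);
    unsigned char *dst = malloc(size);
    clock_t start, end;
    long i;

    if (src == NULL || dst == NULL) {
        free(src);
        free(dst);
        return -1.0;
    }
    memset(src, 0xA5, size);

    start = clock();
    for (i = 0; i < reps; i++)
        fn(dst, src, size);
    end = clock();

    sink = dst[size - 1];
    free(src);
    free(dst);
    return (double)(end - start) / CLOCKS_PER_SEC;
}

/* Return whichever routine took less time for this size; ties and
 * allocation failures fall back to `a`. */
static copy_fn pick_faster(copy_fn a, copy_fn b, size_t size, long reps)
{
    double ta = time_copy(a, size, reps);
    double tb = time_copy(b, size, reps);

    if (ta < 0.0 || tb < 0.0)
        return a;
    return (tb < ta) ? b : a;
}
```

Calling pick_faster(copy_libc, copy_bytewise, size, reps) at several
sizes (say 16 bytes, 4 KB and 1 MB) would also show whether short
copies favor one routine, as speculated in 4). clock() is coarse, so
reps has to be large enough for the difference to be measurable.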
Craig Milo Rogers