Re: Interesting pentium-memcpy results

Ingo Molnar (mingo@pc7537.hil.siemens.at)
Tue, 29 Jul 1997 09:17:36 +0200 (MET DST)


On Mon, 28 Jul 1997, Benjamin Saller Bender wrote:

> Chris Evans <chris@ferret.lmh.ox.ac.uk> writes:
> >I just compared 2.1.46 vs. 2.1.46+pentium memcpy patch, and interestingly
> >enough found that the UNIX byte benchmarks tended to _drop_ a fair bit,

> I haven't done the research to verify how well this will work under
> Linux, but for a large memcpy on the P5 we may wanna consider using the FP
> unit with double precision reads and writes. I have sample code if anyone
> is interested.

sigh, this FPU trick is exactly what the 'pentium-memcpy' patch does:

+ "fildq 0x0(%2)\n\t"
+ "fildq 0x20(%2)\n\t"
+ "fildq 0x40(%2)\n\t"
+ "fildq 0x60(%2)\n\t"
+ "fildq 0x80(%2)\n\t"
+ "fildq 0xa0(%2)\n\t"
+ "fildq 0xc0(%2)\n\t"
+ "fildq 0xe0(%2)\n\t"
+ "fxch\n\t"
+ "fistpq 0xc0(%1)\n\t"
+ "fistpq 0xe0(%1)\n\t"
+ "fistpq 0xa0(%1)\n\t"
+ "fistpq 0x80(%1)\n\t"
+ "fistpq 0x60(%1)\n\t"
+ "fistpq 0x40(%1)\n\t"
+ "fistpq 0x20(%1)\n\t"
+ "fistpq 0x0(%1)\n\t"
+
+ "addl $8, %2\n\t"
+ "addl $8, %1\n\t"
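
For anyone following along without the patch at hand, the general idea
(move the data in 64-bit units rather than 32-bit ones) can be sketched in
portable C. This is only an illustration, not the patch itself:
copy_by_qwords is a made-up name, and the 8-byte memcpy pairs merely stand
in for the fildq/fistpq loads and stores. A compiler is free to implement
them with ordinary integer moves, so this shows the access pattern, not
the FPU path or its timing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not from the patch: copy n bytes in 64-bit units.
 * Buffers must not overlap, same as memcpy. */
void *copy_by_qwords(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    assert(d != NULL && s != NULL);

    /* Bulk: one 8-byte load and one 8-byte store per iteration.
     * In the patch these correspond to a fildq/fistpq pair. */
    while (n >= 8) {
        uint64_t q;
        memcpy(&q, s, sizeof q);   /* 64-bit load  */
        memcpy(d, &q, sizeof q);   /* 64-bit store */
        s += 8;
        d += 8;
        n -= 8;
    }

    /* Tail: whatever is left over (0..7 bytes), one byte at a time. */
    while (n > 0) {
        *d++ = *s++;
        n--;
    }
    return dst;
}
```

Calling copy_by_qwords(dst, src, 37) moves four 8-byte units and then five
single bytes.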

-- mingo