Re: MMX performance....

Oliver Xymoron (oxymoron@waste.org)
Thu, 6 Feb 1997 16:21:48 -0600 (CST)


On 6 Feb 1997, Robert Krawitz wrote:

> In article <Pine.LNX.3.95.970206153658.11525D-100000@pc5829.hil.siemens.at> Ingo Molnar <mingo@pc5829.hil.siemens.at> writes:
>
> On Thu, 6 Feb 1997, Dale R. Worley wrote:
>
> > From what I understand, every time you switch between MMX mode and regular
> > FP mode, 100 or so cycles are burned. If you are context switching
> > a lot (any multitasking environment), this would seem to add up.
> >
> > Assuming that the "cycles" are fundamental CPU cycles (as opposed to
> > memory accesses, or something), that could take 1 microsecond or less
> > (depending on your clock speed), which isn't much. [...]
>
> 2.1.25 does a system call in 150 cycles and context switches in 190
> cycles. Wanna add 100 cycles to each memory copy operation?
>
> 100 cycles are a lot. And XFree86 is rendering fonts using the FPU. And we
> have the pentium memcpy patch which uses the FPU for 64 bit wide memory
> copy.
>
> Hmm. I haven't had a chance to look at the MMX instruction set, but
> I'll be shocked, SHOCKED, if the MMX instruction set doesn't have 64
> bit memory transfer instructions. Perhaps a logical alternative would
> be to implement the Pentium memcpy in terms of whichever FPU/MMX mode
> was in effect at the time.

I'd be shocked as well. Most of the core instruction times are listed as
1 cycle, and from what I can tell, MMX is just a hack on the already existing
functional units in the FPU, taking advantage of the fast multiplier, etc.

> The Pentium memcpy() patch, BTW, has a lot of overhead of its own; it
> dumps and restores the FPU state (when it's in use it dumps the
> registers; when not, it dumps just the rest of the state). That's why
> it's configured to operate only when the amount of data to be copied
> is large. The overhead is well worth it, though, since memory
> bandwidth on write is used so much more efficiently.

What's the break-even copy size? Your patch seems to suggest 512 or 1024
bytes.
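
To make the question concrete, here is a rough sketch of the size-threshold
idea, not the actual patch. Everything named here is hypothetical:
WIDE_COPY_THRESHOLD, wide_copy(), threshold_memcpy() and break_even_bytes()
are made up for illustration, and wide_copy() is a portable stand-in that
moves 8 bytes at a time rather than real FPU/MMX code with state save and
restore. The break-even estimate assumes a simple model: a fixed setup cost
in cycles, divided by the per-byte saving of the wide path.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical cutoff; the thread only suggests 512 or 1024 bytes. */
#define WIDE_COPY_THRESHOLD 1024

/*
 * Stand-in for the wide path. A real version would save the FPU state,
 * use 64-bit loads and stores, and restore the state afterwards; this
 * one just copies in 8-byte chunks so the sketch stays portable.
 */
static void *wide_copy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    while (n >= 8) {
        uint64_t w;
        memcpy(&w, s, 8);   /* 64-bit load without alignment assumptions */
        memcpy(d, &w, 8);   /* 64-bit store */
        d += 8;
        s += 8;
        n -= 8;
    }
    while (n--)             /* copy the 0..7 trailing bytes */
        *d++ = *s++;
    return dst;
}

/* Take the wide path only when the copy is large enough to pay for it. */
void *threshold_memcpy(void *dst, const void *src, size_t n)
{
    if (n < WIDE_COPY_THRESHOLD)
        return memcpy(dst, src, n);
    return wide_copy(dst, src, n);
}

/*
 * Break-even size under the assumed model:
 *   overhead_cycles / (plain cycles per byte - wide cycles per byte),
 * rounded to the nearest byte. Returns 0 if the wide path is not
 * faster per byte, i.e. there is no break-even point.
 */
size_t break_even_bytes(double overhead_cycles,
                        double plain_cycles_per_byte,
                        double wide_cycles_per_byte)
{
    double saving = plain_cycles_per_byte - wide_cycles_per_byte;

    if (saving <= 0.0)
        return 0;
    return (size_t)(overhead_cycles / saving + 0.5);
}
```

With made-up inputs of 200 cycles of setup, 0.5 cycles/byte for the plain
copy and 0.25 cycles/byte for the wide one, the model gives 800 bytes, which
at least lands between the two figures above. Real numbers would have to come
from measuring the patch itself.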

--
 "Love the dolphins," she advised him. "Write by W.A.S.T.E.."