> + __asm__ __volatile__ (
> + " movq (%0), %%mm0\n"
> + " movq 8(%0), %%mm1\n"
> + " movq 16(%0), %%mm2\n"
> + " movq 24(%0), %%mm3\n"
> + " movq %%mm0, (%1)\n"
> + " movq %%mm1, 8(%1)\n"
> + " movq %%mm2, 16(%1)\n"
> + " movq %%mm3, 24(%1)\n"
> + " movq 32(%0), %%mm0\n"
> + " movq 40(%0), %%mm1\n"
> + " movq 48(%0), %%mm2\n"
> + " movq 56(%0), %%mm3\n"
> + " movq %%mm0, 32(%1)\n"
> + " movq %%mm1, 40(%1)\n"
> + " movq %%mm2, 48(%1)\n"
> + " movq %%mm3, 56(%1)\n"
> : : "r" (from), "r" (to) : "memory");
> from+=64;
> to+=64;
As all this is trying to avoid bus turnarounds (i.e. switching from
reading to writing), wouldn't it be fastest to just trust that the CPU
has at least 4k worth of cache? (and hope for the best that we don't
get interrupted in the meanwhile).
void copy_page (char *dest, char *source)
{
long *dst = (long *)dest,
*src=(long *)source,
*end= (long *)(source+PAGE_SIZE);
#if 1
register int i;
long t=0;
static long tt;
for (i=0;i<PAGE_SIZE/sizeof (long);i += cache_line_size()/sizeof(long))
/* Actually the innards of this loop should be:
(void) from[i];
however, the compiler will probably optimize that away. */
t += src[i];
tt = t;
#endif
while (src < end)
*dst++ = *src++;
}
So, this is 15 lines of C, and it'd be interesting to benchmark this
against the assembly.
I'm assuming that the "loop variable handling" is not going to
influence the overall performance: that would run at 500 - 1000MHz,
and around 1 clock cycle (1-2ns) per loop. Set this against the stalls
against the memory unit whose output buffer is full, and memory writes
that take on the order of 30 ns per 64bits.
At least, that's what I expect, however, I haven't been optimizing
cycles for quite a while....
Roger.
-- ** R.E.Wolff@BitWizard.nl ** http://www.BitWizard.nl/ ** +31-15-2137555 ** *-- BitWizard writes Linux device drivers for any device you may have! --* * There are old pilots, and there are bold pilots. * There are also old, bald pilots. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
This archive was generated by hypermail 2b29 : Mon May 07 2001 - 21:00:22 EST