> I think you are mistaken. The prefetch instructions came to
> Intel CPUs with SSE. There are no (afair) no SSE Pentium II's.
Correct.
And while we are at it, any kernel guru like to take a squint at
include/asm-i386 and arch/i386 ?
It seems to me that it should be possible to get a lot more out
of P3's and P4's.
Taking the example of the (misnamed) 3DNOW page_clear code,
for the P4(SSE2) it could be implemented as :
__asm__ __volatile__ (
" pxor %%xmm0, %%xmm0\n" : :
);
for(i=0;i<4096/128;i++)
{
__asm__ __volatile__ (
" movntdq %%xmm0, (%0)\n"
" movntdq %%xmm0, 16(%0)\n"
" movntdq %%xmm0, 32(%0)\n"
" movntdq %%xmm0, 48(%0)\n"
" movntdq %%xmm0, 64(%0)\n"
" movntdq %%xmm0, 80(%0)\n"
" movntdq %%xmm0, 96(%0)\n"
" movntdq %%xmm0, 112(%0)\n"
: : "r" (page) : "memory");
page+=128;
}
/* since movntdq is weakly-ordered, a "sfence" is needed to become
* ordered again.
*/
__asm__ __volatile__ (
" sfence \n" : :
);
I'm quite willing to be flamed on this :-)
Margit
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
This archive was generated by hypermail 2b29 : Sat Dec 07 2002 - 22:00:16 EST