On Thu, Sep 19, 2002 at 07:01:54PM -0700, David S. Miller wrote:
> From: Andi Kleen <firstname.lastname@example.org>
> Date: Fri, 20 Sep 2002 04:06:19 +0200
> > See "movntdq/movnti", the latter of which even works on integer
> > registers. Ben LaHaise pointed this out to me earlier today.
> The issue is that you really want to do prefetching in these loops
> (waiting for the hardware prefetch is too slow because it needs several
> cache misses to trigger) so for cache hints on reading only prefetch
> instructions are interesting.
> I'm talking about using this to bypass the cache on the stores.
> The prefetches are a separate issue and I agree with you on that.
I was talking generally. You cannot really use these instructions on Athlon,
because there they are either microcoded and slow or do not exist. On Athlon
you need the 3DNow! write-combining functions instead (which adds FPU
overhead, so it may not be worth it). On P3/P4 you can use movnti/movntdq, yes.
Doing it just for reads is trickier and more dubious.
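
To make the two pieces concrete, here is a rough user-space sketch (not
kernel code, and not anything from the patch under discussion) of a copy
loop that prefetches the source and stores with non-temporal hints. It
assumes a compiler that provides the SSE2 intrinsics in <emmintrin.h>; the
names copy_nocache and PREFETCH_AHEAD are made up for illustration, the
prefetch distance is a guess that would need tuning, and dst is assumed to
be 16-byte aligned. It does nothing for the Athlon case above. Without SSE2
it simply falls back to memcpy.

```c
#include <stddef.h>
#include <string.h>

#if defined(__SSE2__)
#include <emmintrin.h>  /* _mm_stream_si128 (movntdq), _mm_stream_si32 (movnti) */
#endif

#define PREFETCH_AHEAD 256  /* hypothetical distance in bytes, needs tuning */

/* Hypothetical helper: copy n bytes from src to dst using non-temporal
 * stores so the destination does not displace useful cache lines.
 * Assumption: dst is 16-byte aligned (movntdq requires it). */
static void copy_nocache(void *dst, const void *src, size_t n)
{
#if defined(__SSE2__)
    char *d = dst;
    const char *s = src;

    while (n >= 16) {
        /* Software prefetch of the source, only while it stays in bounds. */
        if (n > PREFETCH_AHEAD)
            _mm_prefetch(s + PREFETCH_AHEAD, _MM_HINT_NTA);

        __m128i v = _mm_loadu_si128((const __m128i *)s);
        _mm_stream_si128((__m128i *)d, v);      /* movntdq */
        d += 16; s += 16; n -= 16;
    }
    while (n >= 4) {
        int w;
        memcpy(&w, s, sizeof w);                /* unaligned-safe load */
        _mm_stream_si32((int *)d, w);           /* movnti, integer register */
        d += 4; s += 4; n -= 4;
    }
    while (n > 0) {                             /* remaining bytes, plain stores */
        *d++ = *s++;
        n--;
    }
    /* Non-temporal stores are weakly ordered; fence before others read dst. */
    _mm_sfence();
#else
    memcpy(dst, src, n);                        /* no SSE2: ordinary copy */
#endif
}
```

Whether this beats a plain copy depends on the CPU and on whether the
destination is touched again soon afterwards, so it would have to be
measured rather than assumed.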