Re: [CFT] faster athlon/duron memory copy implementation

From: Manfred Spraul (manfred@colorfullife.com)
Date: Thu Oct 24 2002 - 14:38:38 EST


erich@uruk.org wrote:

>>copy_page() tests
>>copy_page function 'warm up run' took 18081 cycles per page
>>copy_page function '2.4 non MMX' took 19487 cycles per page
>>copy_page function '2.4 MMX fallback' took 19403 cycles per page
>>copy_page function '2.4 MMX version' took 18086 cycles per page
>>copy_page function 'faster_copy' took 11372 cycles per page
>>copy_page function 'even_faster' took 11183 cycles per page
>>copy_page function 'no_prefetch' took 7815 cycles per page
>>1020 [maw] (buruk) /tmp/athlon # athlon_test
>>
>>
>
>
>Whoa! Hmm.
>
>If I'm reading this right, with a processor speed of 1.666 GHz,
>you're getting:
>
> (4096 bytes / 7815 clocks) * 1.666 GHz = 873 MB/sec
>
>The perfect peak performance of your setup, if the cache implements
>standard write-allocate behavior (the target cache line is read before it
>is written because the write logic doesn't know you're going to overwrite
>the whole line in cases like this), should be:
>
>
There is no write allocate.

There are 2 optimizations for bulk memory copy:
- avoid the write allocate. Possible with the mmx or sse non-temporal
cache hints
    * already in the kernel. Difference between MMX and faster_copy
- avoid dram page misses, and stream from the memory chips with maximum
efficiency.
    * new optimization. "prefetch" is a hint for the cpu that the
program might need the memory
        If I understand the AMD document correctly, then this is not
what's needed for bulk
        memory copy: we know that we'll need that cacheline. Thus a real
read, to force the cpu to
        fetch the cacheline, even if all read buffers are occupied.

--
    Manfred

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Thu Oct 31 2002 - 22:00:24 EST