memcpy() too slow?

Matthew Wilcox (willy@odie.barnet.ac.uk)
Fri, 17 Apr 1998 07:54:44 +0100 (BST)


Someone asserted the other day that memcpy() was too slow, and that keeping
a large number of tasks paged out in order to obtain contiguous blocks was
better. Perhaps memcpy() should not be used in this case; after all, it has
to deal with the general byte-aligned copy, whereas we only need a
page-aligned copy routine here. Not having a PC to hand and not being able
to program in x86 assembler, I implemented the following function in ARM
assembler as a test. Intel CPUs probably already have a memory block copy
instruction anyway :-) The prototype would be:

void pagecpy(const void *from, void *to, unsigned int pages);

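.pagecpy
STMFD r13!, {r4-r8, r14}      ; save work registers and return address
; 4096 bytes per page / 32 bytes per loop = 128 loops per page
MOV r2, r2, LSL #7            ; r2 = pages * 128 = loop count
.loop
LDMIA r0!, {r3-r8, r12, r14}  ; load 8 words (32 bytes) from source
SUBS r2, r2, #1 ; reordered for StrongARM. Neutral on others.
STMIA r1!, {r3-r8, r12, r14}  ; store the 8 words to destination
BNE loop
LDMFD r13!, {r4-r8, pc}       ; restore registers and return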
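
For those who don't read ARM assembler, here is a rough C sketch of the same
idea: copy whole pages, eight 32-bit words per pass, with no byte-alignment
handling. This is only an illustration, not the routine that was timed. The
names PAGECPY_PAGE_SIZE and PAGECPY_WORDS_PER_LOOP and the 4096-byte page
size are assumptions, and a compiler may or may not turn the loop body into
load-multiple/store-multiple instructions.

```c
#include <assert.h>
#include <stdint.h>

#define PAGECPY_PAGE_SIZE      4096u  /* assumed page size, as in the post */
#define PAGECPY_WORDS_PER_LOOP 8u     /* 8 x 32-bit words = 32 bytes per pass */

/* Copy `pages` whole pages from `from` to `to`.
 * Both pointers are assumed to be at least word-aligned (page-aligned in
 * the intended use). For a correct copy the regions should not overlap. */
void pagecpy(const void *from, void *to, unsigned int pages)
{
    const uint32_t *src = from;
    uint32_t *dst = to;
    /* 4096 / 32 = 128 passes per page, mirroring "MOV r2, r2, LSL #7". */
    unsigned long loops = (unsigned long)pages *
        (PAGECPY_PAGE_SIZE / (PAGECPY_WORDS_PER_LOOP * 4u));

    assert((uintptr_t)from % sizeof(uint32_t) == 0);
    assert((uintptr_t)to % sizeof(uint32_t) == 0);

    /* Unlike the assembler, pages == 0 copies nothing here. */
    while (loops--) {
        /* Load eight words first, then store them, as LDMIA/STMIA do. */
        uint32_t a = src[0], b = src[1], c = src[2], d = src[3];
        uint32_t e = src[4], f = src[5], g = src[6], h = src[7];
        dst[0] = a; dst[1] = b; dst[2] = c; dst[3] = d;
        dst[4] = e; dst[5] = f; dst[6] = g; dst[7] = h;
        src += PAGECPY_WORDS_PER_LOOP;
        dst += PAGECPY_WORDS_PER_LOOP;
    }
}
```

A caller would pass two page-aligned buffers and a page count, for example
pagecpy(old_page, new_page, 8) for a 32k chunk.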

I used my Acorn RiscPC with an ARM610 processor to test. The CPU is clocked
at 30MHz and the bus is 32 bits wide at 16MHz. It has no secondary cache
and 4k of 4-way set-associative first level cache. It took 86 centiseconds
to copy 16MB of overlapping memory - the start addresses were &4000 and
&80000, so the chances of data being in the 4k cache were remote. I'm sure
those with faster or wider busses, EDO RAM or SDRAM and second level caches
would get better results than I. The typical Pentium has a 64-bit wide
memory bus, clocked at 33 or 50MHz, so you can divide my timings by 4 or 6
immediately.
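
The "divide by 4 or 6" estimate is just the ratio of bus widths times the
ratio of bus clocks. A small sketch of that arithmetic follows; the function
name bus_scale is made up for illustration, and the model deliberately
ignores caches, DRAM type and latency, so treat the result as a rough guess
rather than a prediction.

```c
/* Rough speed-up from bus width and clock alone, relative to the
 * 32-bit, 16MHz RiscPC bus described in the post. This is an assumed
 * back-of-envelope model, not a measurement. */
double bus_scale(double width_bits, double clock_mhz)
{
    const double ref_width_bits = 32.0;  /* RiscPC bus width */
    const double ref_clock_mhz = 16.0;   /* RiscPC bus clock */

    return (width_bits / ref_width_bits) * (clock_mhz / ref_clock_mhz);
}
```

bus_scale(64, 33) comes out a little above 4 and bus_scale(64, 50) a little
above 6, which is where the two divisors come from.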

On my machine the speed of pagecpy is 210us/page, plus-or-minus 1us, which
is approx 5 pages per millisecond. So if a driver wanted a contiguous 32k
chunk (8 pages), it would have to wait about 8/5 of a millisecond (roughly
1.7ms) for it, in the worst case. Is this really too slow compared to the
extra swapping involved with keeping large amounts of free memory?
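
The per-page figure can be rechecked from the raw numbers: 16MB is 4096
pages of 4k, and 86 centiseconds spread over 4096 pages is roughly 210us
each, so a 32k chunk (8 pages) costs roughly 1.7ms if every page has to be
copied. A sketch of that calculation, with made-up helper names and an
assumed 4096-byte page:

```c
#define PAGE_BYTES 4096.0  /* assumed page size */

/* Microseconds per page, given a total copy time and total size. */
double us_per_page(double total_seconds, double total_bytes)
{
    return total_seconds * 1e6 / (total_bytes / PAGE_BYTES);
}

/* Worst-case wait for a chunk if every page in it must be copied. */
double chunk_wait_us(double chunk_bytes, double page_us)
{
    return (chunk_bytes / PAGE_BYTES) * page_us;
}
```

For the timings above, us_per_page(0.86, 16.0 * 1024 * 1024) gives the
per-page cost and chunk_wait_us(32.0 * 1024, ...) the wait for a 32k chunk.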

FWIW, I agree that this should not go into the idle thread. My idle thread
never gets the chance to run, now I've discovered rc5 :-)

-- 
Set Alias$Case Set Alias$[ |||| |MSet Alias$Otherwise Set Alias$[ \ Matthew
"" |MSet Alias$When If %0=%%0 Then Set Alias$[ "" ||MIf %0=%%0    \ Wilcox
Then Set Alias$Otherwise Set Alias$[ |||||||||||||||| ||MIf       \
%0=%%0 Then Set Alias$When Set Alias$[ ||||||||||||||||
