Re: memcpy

Linus Torvalds (Linus.Torvalds@cs.helsinki.fi)
Sat, 11 Nov 1995 10:35:23 +0200


"Jay Estabrook - Alpha Migration Tools - LINUX Project": "" (Nov 10, 20:54):
>
> Linus,
>
> You mentioned in some previous mail re: the 1.3.38 release:
>
> "some memory management fixes to fix the buffer cache handling. It
> should work now, but I'll have to clean it up a bit more still.. "
>
> and...
>
> "...and a few cleanups by yours truly to fix the memcpy
> stuff, for example)"
>
> I wanted to give you some feedback on just what those accomplished.
>
> I had been observing that full kernel builds ("make dep; make") on my CABRIO
> (64Mb memory and reasonable SCSI disk) had been taking about 17+ minutes.
>
> My latest builds of NONAME and CABRIO kernels, using a 1.3.38-based booted
> kernel took 11+ minutes.
>
> Your buffer cache and memcpy "fixups" cut the time by > 30% !!!!!!!!!!!!

Actually, that's _only_ due to the memcpy() fixes. The buffer cache
fixes were a non-performance issue, and "only" fixed some behavioural
things when unlinking or truncating a file that was the backing-store
for a memory mapping.

I'm not too surprised about the 30% figure, actually: for some silly
reason the alpha kernel used to use the generic memcpy (which copies a
byte at a time) instead of the more optimized memcpy, despite the fact
that I had actually _written_ the optimized memcpy a long time ago.

Now, the optimized memcpy() is roughly ten times faster than the
byte-at-a-time one, and _on_top_of_that_ is also much better for cache
performance. And it's used a _lot_, notably for page copying at a C-O-W
fault.
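
To illustrate the difference described above, here is a minimal sketch
in C of the two approaches. It is not the actual alpha kernel code: the
names copy_bytewise() and copy_wordwise() are hypothetical, and the
64-bit word size is an assumption chosen to match a 64-bit machine such
as the alpha.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Generic fallback (hypothetical name): one load and one store per byte. */
void *copy_bytewise(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    while (n--)
        *d++ = *s++;
    return dst;
}

/*
 * Sketch of a word-at-a-time copy (hypothetical name). Once the
 * destination is 8-byte aligned, each loop iteration moves 8 bytes
 * instead of 1, so the bulk of the copy needs far fewer loads,
 * stores and loop iterations.
 */
void *copy_wordwise(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* Head: copy single bytes until the destination is 8-byte aligned. */
    while (n && ((uintptr_t)d & 7)) {
        *d++ = *s++;
        n--;
    }

    /*
     * Body: one 64-bit word per iteration. The fixed-size memcpy()
     * calls are the portable way to express a word load and store
     * (the source may still be unaligned); compilers typically turn
     * them into plain load/store instructions.
     */
    while (n >= 8) {
        uint64_t w;

        memcpy(&w, s, 8);
        memcpy(d, &w, 8);
        d += 8;
        s += 8;
        n -= 8;
    }

    /* Tail: the remaining 0..7 bytes. */
    while (n--)
        *d++ = *s++;
    return dst;
}
```

Both functions produce the same result for non-overlapping buffers.
A real optimized memcpy would typically also handle mutually
misaligned source and destination with shifts and masks and unroll
the inner loop, which this sketch leaves out.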

Some profiling on the x86 side shows that the kernel spends something
like 20% of its time just copying and clearing memory, so it really does
make a huge difference.
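
As a rough cross-check of the figures in this thread, the sketch below
works backwards from the build times to the share of time that must
have gone into copying. It assumes (these are assumptions, not
measurements) that the whole saving comes from copying, that copying
became uniformly ten times faster, and that "17+" and "11+" minutes can
be read as 17 and 11. The function name copy_fraction() is hypothetical.

```c
/*
 * If a fraction f of the old run time was spent copying, and copying
 * becomes `speedup` times faster, the new time is
 *
 *     after = before * ((1 - f) + f / speedup)
 *
 * Solving for f gives the expression below.
 */
double copy_fraction(double before, double after, double speedup)
{
    double saved = 1.0 - after / before;   /* relative time saved */

    return saved / (1.0 - 1.0 / speedup);
}
```

With before = 17, after = 11 and speedup = 10 this gives roughly 0.39,
i.e. under those assumptions somewhat more than a third of the old
build time went into byte-at-a-time copying.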

Linus