Re: [PATCH] optimize ia32 memmove

From: Andrew Morton
Date: Tue Dec 30 2003 - 02:54:37 EST


Jeff Garzik <jgarzik@xxxxxxxxx> wrote:
>
> Linux Kernel Mailing List wrote:
> > ChangeSet 1.1496.22.32, 2003/12/29 21:45:30-08:00, akpm@xxxxxxxx
> >
> > [PATCH] optimize ia32 memmove
> >
> > From: Manfred Spraul <manfred@xxxxxxxxxxxxxxxx>
> >
> > The memmove implementation of i386 is not optimized: it uses movsb, which is
> > far slower than movsd. The optimization is trivial: if dest is less than
> > source, then call memcpy(). markw tried it on a 4xXeon with dbt2, it saved
> > around 300 million cpu ticks in cache_flusharray():
> [...]
> > diff -Nru a/include/asm-i386/string.h b/include/asm-i386/string.h
> > --- a/include/asm-i386/string.h Mon Dec 29 23:13:20 2003
> > +++ b/include/asm-i386/string.h Mon Dec 29 23:13:20 2003
> > @@ -299,14 +299,9 @@
> > static inline void * memmove(void * dest,const void * src, size_t n)
> > {
> > int d0, d1, d2;
> > -if (dest<src)
> > -__asm__ __volatile__(
> > - "rep\n\t"
> > - "movsb"
> > - : "=&c" (d0), "=&S" (d1), "=&D" (d2)
> > - :"0" (n),"1" (src),"2" (dest)
> > - : "memory");
> > -else
> > +if (dest<src) {
> > + memcpy(dest,src,n);
> > +} else
> > __asm__ __volatile__(
> > "std\n\t"
> > "rep\n\t"
>
> Dumb question, though... what about the overlap case, when dest<src ?
> It seems to me this change is ignoring that.
>

"if dest is less that source, then call memcpy". If the move is to a
higher address we do it the old way.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/