Re: Different Memcpy system (see memcpy by 8)

David Luyer (luyer@ucs.uwa.edu.au)
Mon, 17 May 1999 18:44:09 +0800


Alex K wrote:

> inline void setmem(void* buf,int len,unsigned char VAR){
> register void* str=buf;
> register void* max=len+str;
> register void* to;
> register int rem;
> register int newint;
> newint=(((((VAR<<8)|VAR)<<8)|VAR)<<8)|VAR;
> rem=len%8;
> to=max-rem;
> while(str<to){
> *((int*)str)=newint;
> str+=4;
> *((int*)str)=newint;
> str+=4;
> }
> while(str<max){
> *((char*)str)=VAR;
> str++;
> }
> }

If "str" isn't at least 4-byte-aligned, this will on some architectures
generate unaligned access traps, and on most architectures be inefficient.
Unaligned access traps too early in the kernel code will be unhandled and
may result in complete kernel crashes.

Also, arithmetic operations on void pointers is not ansi C, and not even
nice C. Make them char pointers at least when playing with them (max=len+str
is even the wrong way around for readability, you should make it str+len).

So you need to add even more overhead which will just be skipped in most
cases - something like:

int k = (str-NULL)%4;
if(k) {
while(k-- && str<to) {
*((char*)str)=VAR;
str++;
}
rem=(max-str)%8;
} else
rem=len%8;
to=max-rem;
...

or similar. Otherwise you've created a routine that people might call
without thinking about alignment of what they're calling it on, and end
up taking a performance hit.

David.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/