Re: [PATCH 1/2] FRV: Implement atomic64_t

From: Eric Dumazet
Date: Fri Jul 03 2009 - 02:06:36 EST


Eric Dumazet wrote:
> I got a 4x speedup on a dual quad-core (Intel E5450) machine when all CPUs try
> to *read* the same atomic64 location.
>
> I tried various init values and got an additional 5% speedup by choosing a
> value *most probably* different from the actual atomic64 one,
> like (1LL << 32), with nice asm output...
>
> static inline unsigned long long atomic64_read(atomic64_t *ptr)
> {
> 	unsigned long long old = (1LL << 32);
>
> 	return cmpxchg8b(&ptr->counter, old, old);
> }
>

My last suggestion would be :

static inline unsigned long long atomic64_read(const atomic64_t *ptr)
{
	unsigned long long res;

	asm volatile(
		"mov %%ebx, %%eax\n\t"
		"mov %%ecx, %%edx\n\t"
		LOCK_PREFIX "cmpxchg8b %1\n"
		: "=A" (res)
		: "m" (*ptr)
		);
	return res;
}

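ebx and ecx are only read here, and their values can be arbitrary: the
compare value (edx:eax) is copied from the new value (ecx:ebx), so if the
compare succeeds, the store writes back what was already in memory. They are
therefore not even mentioned in the asm constraints, and gcc is allowed to
keep useful values in these registers.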

So the following (stupid) example

	for (i = 0; i < 10000000; i++) {
		res += atomic64_read(&myvar);
	}

gives:

	xorl	%esi, %esi
.L2:
	mov	%ebx, %eax
	mov	%ecx, %edx
	lock;cmpxchg8b myvar
	addl	%eax, %ecx
	adcl	%edx, %ebx
	addl	$1, %esi
	cmpl	$10000000, %esi
	jne	.L2
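
For anyone who wants to try the read-by-cmpxchg idea outside the kernel, here
is a minimal user-space sketch. It is not the proposed kernel code: it assumes
a compiler providing the GCC/Clang __atomic builtins, and the names
read64_via_cas() and read64_via_cas_selftest() are made up for illustration.
On 32-bit x86 the builtin is expected to become a lock cmpxchg8b, but that
depends on compiler and flags, and some 32-bit targets may need libatomic.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from this thread): read a 64-bit value with a
 * compare-and-swap whose expected and new values are the same guess.
 */
static inline uint64_t read64_via_cas(uint64_t *ptr)
{
	uint64_t guess = 1ULL << 32;	/* most probably not the stored value */

	/*
	 * If *ptr == guess, the same value is stored back, so memory is
	 * unchanged. Otherwise the builtin copies the current *ptr into guess.
	 * Either way guess ends up holding the value that was in memory.
	 */
	__atomic_compare_exchange_n(ptr, &guess, guess, 0,
				    __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
	return guess;
}

/* Small self-check covering both the mismatch and the match case. */
static void read64_via_cas_selftest(void)
{
	uint64_t v = 0x123456789abcdef0ULL;

	assert(read64_via_cas(&v) == 0x123456789abcdef0ULL);	/* mismatch */
	assert(v == 0x123456789abcdef0ULL);

	v = 1ULL << 32;
	assert(read64_via_cas(&v) == (1ULL << 32));		/* match */
	assert(v == (1ULL << 32));
}
```

Unlike the asm version above, this sketch cannot leave the compare and new
values unconstrained, so the compiler has to load the guess into registers on
every call.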