Re: [patch] __volatile__ needed in get_cycles()?

Gabriel Paubert (paubert@iram.es)
Tue, 30 Mar 1999 11:19:36 +0200 (METDST)


On Tue, 30 Mar 1999, Tigran Aivazian wrote:

> Hi Andrea,
>
> On Mon, 29 Mar 1999, Andrea Arcangeli wrote:
>
> > +extern cycles_t inline get_cycles_ordered(void)
> > +{
> > + /* I know I should not use `register' but I can't resist ;). -Andrea */
> > + register cycles_t foo;
> > +
> > + __asm__ __volatile__ ("lock; addl $0,0(%%esp)": :);
> > + foo = get_cycles();
> > + __asm__ __volatile__("": :);
> > +
> > + return foo;
> > +}
> > +
> Yes, I like the above, I did not know you can do
>
> __asm__ __volatile__("": :);
>
> to stop compiler from re-ordering things (because I never looked at wmb()
> macro). (and I like the "pseudo-smiley" at the end :).
>
> Also, certainly bus-locked movl (or addl) is cheaper than cpuid (cpuid is
> messy and trashes your registers).

I disagree, do not forget that a bus locked transaction just does this: it
goes to the bus and performs a Read-Modify-Write operation which requires
a) arbitration for bus ownership and b) several bus clocks to perform the
operation. With CPU to bus clock ratios around 5 these days, this is tens
of CPU clocks as a minimum and is very dependant on current bus
utilization (snoop cache writebacks, other CPUs and PCI bridge
transactions). I would think that the following in a _single_ asm:

pushl %ecx
xorl %eax,%eax
pushl %ebx
cpuid
popl %ebx
rdtsc
popl %ecx

will be faster and have a more reproducible timing (especially on SMP).
The xorl instruction gives well defined return value and protects against
possible variations of CPUID timing on input value. It's about 3 bytes
longer than the solution with 'lock addl' but is unlikely to perform any
bus transaction since only uses memory locations which have a high
probability of being in the cache.

Regards,
Gabriel

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/