Re: [PATCH 0/3] Add optimized SHA-1 implementations for x86 and x86_64

From: Adrian Bunk
Date: Mon Jun 11 2007 - 16:30:34 EST


On Fri, Jun 08, 2007 at 05:42:42PM -0400, Benjamin Gilbert wrote:
> The following 3-part series adds assembly implementations of the SHA-1
> transform for x86 and x86_64. For x86_64 the optimized code is always
> selected; on x86 it is selected if the kernel is compiled for i486 or above
> (since the code needs BSWAP). These changes primarily improve the
> performance of the CryptoAPI SHA-1 module and of /dev/urandom. I've
> included some performance data from my test boxes below.
>
> This version incorporates feedback from Herbert Xu. Andrew, I'm sending
> this to you because of the (admittedly tiny) intersection with arm and s390
> in part 1.
>
> -
>
> tcrypt performance tests:
>
> === Pentium IV in 32-bit mode, average of 5 trials ===
> Test# Bytes/ Bytes/ Cyc/B Cyc/B Change
>...
> I've also done informal tests on other boxes, and the performance
> improvement has been in the same ballpark.
>
> On the aforementioned Pentium IV, /dev/urandom throughput goes from 3.7 MB/s
> to 5.6 MB/s with the patches; on the Core 2, it increases from 5.5 MB/s to
> 8.1 MB/s.
>...

With which gcc version and compiler flags?

And why is the C code slower?
Problems in the C code?
gcc problems?

Generally, I'd really prefer one C implementation that works well on all
platforms over getting dozens of different assembler implementations,
each potentially with different bugs.

cu
Adrian

--

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
