Re: [PATCH] lib/crypto: blake2b: Roll up BLAKE2b round loop on 32-bit

From: Eric Biggers

Date: Fri Dec 05 2025 - 15:14:13 EST


On Fri, Dec 05, 2025 at 02:16:44PM +0000, david laight wrote:
> Note that executing two G() in parallel probably requires the source
> interleave the instructions for the two G() rather than relying on the
> cpu's 'out of order execution' to do all the work
> (Intel cpu might manage it...).

I actually tried that earlier, and it didn't help. Either the compiler
interleaved the calculations already, or the CPU did, or both.
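
For anyone following along, here is a rough sketch of what hand-interleaving
two G() calls in the source looks like. It is illustrative only and is not
the code that was actually tested: the G() macro follows the mixing function
in RFC 7693, while the names two_g_sequential() and two_g_interleaved() and
the m[4] message-word layout are made up for this example.

```c
#include <stdint.h>

/* Rotate right; n is always one of 32, 24, 16, 63 here, so never 0. */
static inline uint64_t ror64(uint64_t w, unsigned int n)
{
	return (w >> n) | (w << (64 - n));
}

/* BLAKE2b mixing function G, following RFC 7693. */
#define G(a, b, c, d, x, y) do {		\
	(a) += (b) + (x);			\
	(d) = ror64((d) ^ (a), 32);		\
	(c) += (d);				\
	(b) = ror64((b) ^ (c), 24);		\
	(a) += (b) + (y);			\
	(d) = ror64((d) ^ (a), 16);		\
	(c) += (d);				\
	(b) = ror64((b) ^ (c), 63);		\
} while (0)

/*
 * Two column G() calls written back to back. They touch disjoint
 * words of v[], so the compiler and an out-of-order CPU are free to
 * overlap them. (Hypothetical helper, for illustration.)
 */
static void two_g_sequential(uint64_t v[16], const uint64_t m[4])
{
	G(v[0], v[4], v[8], v[12], m[0], m[1]);
	G(v[1], v[5], v[9], v[13], m[2], m[3]);
}

/*
 * The same two G() calls with their steps interleaved by hand, so
 * that each source line is independent of its neighbour.
 * (Hypothetical helper, for illustration.)
 */
static void two_g_interleaved(uint64_t v[16], const uint64_t m[4])
{
	v[0] += v[4] + m[0];			v[1] += v[5] + m[2];
	v[12] = ror64(v[12] ^ v[0], 32);	v[13] = ror64(v[13] ^ v[1], 32);
	v[8] += v[12];				v[9] += v[13];
	v[4] = ror64(v[4] ^ v[8], 24);		v[5] = ror64(v[5] ^ v[9], 24);
	v[0] += v[4] + m[1];			v[1] += v[5] + m[3];
	v[12] = ror64(v[12] ^ v[0], 16);	v[13] = ror64(v[13] ^ v[1], 16);
	v[8] += v[12];				v[9] += v[13];
	v[4] = ror64(v[4] ^ v[8], 63);		v[5] = ror64(v[5] ^ v[9], 63);
}
```

Both functions compute the same result. Whether the second form produces
different machine code depends on the compiler's instruction scheduling,
which is one way to check which of the explanations above applies.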

This definitely could use some more investigation to better understand
exactly what is going on, though.

You're welcome to take a closer look, if you're interested.

- Eric