Re: [PATCH] poly1305: generic C can be faster on chips with slow unaligned access

From: Jason A. Donenfeld
Date: Wed Nov 02 2016 - 18:01:55 EST


On Wed, Nov 2, 2016 at 10:26 PM, Herbert Xu <herbert@xxxxxxxxxxxxxxxxxxx> wrote:
> What I'm interested in is whether the new code is sufficiently
> close in performance to the old code, particularly on x86.
>
> I'd much rather only have a single set of code for all architectures.
> After all, this is meant to be a generic implementation.

Just tested. I get a 6% slowdown on my Skylake. No good. I think it's
probably best to have the two paths in there, and not reduce it to
one.
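
For illustration only, a minimal sketch of what keeping two load paths
could look like. This is not the code from the patch: the
HAVE_FAST_UNALIGNED macro and the load_le32* helper names are
hypothetical, and the endianness check relies on the GCC/Clang
predefined __BYTE_ORDER__ macros.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical switch standing in for an arch-level config option. */
#ifndef HAVE_FAST_UNALIGNED
#define HAVE_FAST_UNALIGNED 1
#endif

/* Portable path: build a little-endian 32-bit word one byte at a time.
 * Never issues an unaligned word load, so it is safe on any chip. */
static inline uint32_t load_le32_bytes(const uint8_t *p)
{
	return (uint32_t)p[0] |
	       ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) |
	       ((uint32_t)p[3] << 24);
}

/* Fast path: a fixed-size memcpy, which compilers typically turn into a
 * single load where unaligned access is cheap. The result matches the
 * little-endian value only on a little-endian host. */
static inline uint32_t load_le32_word(const uint8_t *p)
{
	uint32_t v;

	memcpy(&v, p, sizeof(v));
	return v;
}

/* Pick the path at compile time; fall back to the byte-wise version
 * unless the host is little-endian and unaligned loads are cheap. */
static inline uint32_t load_le32(const uint8_t *p)
{
#if HAVE_FAST_UNALIGNED && defined(__BYTE_ORDER__) && \
	(__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__)
	return load_le32_word(p);
#else
	return load_le32_bytes(p);
#endif
}
```

Both helpers return the same value for the same input, so the choice
between them only affects speed, not the computed MAC.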