Re: [PATCH 5/5] crypto: chacha20 - Fix keystream alignment for chacha20_block()

From: Ard Biesheuvel
Date: Wed Nov 22 2017 - 17:06:15 EST


On 22 November 2017 at 21:29, Eric Biggers <ebiggers3@xxxxxxxxx> wrote:
> On Wed, Nov 22, 2017 at 08:51:57PM +0000, Ard Biesheuvel wrote:
>> On 22 November 2017 at 19:51, Eric Biggers <ebiggers3@xxxxxxxxx> wrote:
>> > From: Eric Biggers <ebiggers@xxxxxxxxxx>
>> >
>> > When chacha20_block() outputs the keystream block, it uses 'u32' stores
>> > directly. However, the callers (crypto/chacha20_generic.c and
>> > drivers/char/random.c) declare the keystream buffer as a 'u8' array,
>> > which is not guaranteed to have the needed alignment.
>> >
>> > Fix it by having both callers declare the keystream as a 'u32' array.
>> > For now this is preferable to switching over to the unaligned access
>> > macros because chacha20_block() is only being used in cases where we can
>> > easily control the alignment (stack buffers).
>> >
>>
>> Given this paragraph, I think we agree the correct way to fix this
>> would be to make chacha20_block() adhere to its prototype, so if we
>> deviate from that, there should be a good reason. On which
>> architecture that cares about alignment is this expected to result in
>> a measurable performance benefit?
>>
>
> Well, variables on the stack tend to be 4 or even 8-byte aligned anyway, so this
> change probably doesn't make a difference in practice currently. But it still
> should be fixed, in case it does become a problem.
>

Agreed.

> We could certainly leave the type as u8 array and use put_unaligned_le32()
> instead; that would be a simpler change. But that would be slower on
> architectures where a potentially-unaligned access requires multiple
> instructions.
>

The access itself would be slower, yes. But given the amount of work
performed in chacha20_block(), I seriously doubt that would actually
matter in practice.
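
To make the two options concrete, here is a rough userspace sketch, not the actual kernel code. The sketch_* names are hypothetical, and sketch_put_unaligned_le32() is only a stand-in for the kernel's put_unaligned_le32(), written as byte stores so it works for any destination alignment. The real chacha20_block() also adds the input state to each word and handles endianness on the word-store path, both of which are left out here.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Stand-in for the kernel's put_unaligned_le32() (assumption: same
 * semantics, i.e. store val in little-endian order at dst). Byte stores
 * never require dst to be aligned.
 */
static void sketch_put_unaligned_le32(uint32_t val, uint8_t *dst)
{
	dst[0] = (uint8_t)(val);
	dst[1] = (uint8_t)(val >> 8);
	dst[2] = (uint8_t)(val >> 16);
	dst[3] = (uint8_t)(val >> 24);
}

/*
 * Option A (what the patch does): the caller declares the keystream as a
 * 32-bit array, so plain word stores are always suitably aligned.
 * Words are stored in host order here; endianness handling is omitted.
 */
static void sketch_emit_words(const uint32_t x[16], uint32_t out[16])
{
	for (size_t i = 0; i < 16; i++)
		out[i] = x[i];
}

/*
 * Option B (the alternative discussed): keep the byte-array prototype
 * and use the unaligned helper, so any byte pointer is acceptable.
 */
static void sketch_emit_bytes(const uint32_t x[16], uint8_t out[64])
{
	for (size_t i = 0; i < 16; i++)
		sketch_put_unaligned_le32(x[i], out + 4 * i);
}
```

Option B costs a few extra instructions per word on architectures without cheap unaligned stores, but that is 16 stores per block against the full set of ChaCha20 rounds, which is the basis for my doubt above.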