Re: [PATCH net 03/24] crypto: Add 'krb5enc' hash and cipher AEAD algorithm
From: Eric Biggers
Date: Fri Feb 07 2025 - 15:05:05 EST

On Mon, Feb 03, 2025 at 02:23:19PM +0000, David Howells wrote:
> [!] Note that the net/sunrpc/auth_gss/ implementation gets a pair of
> ciphers, one non-CTS and one CTS, using the former to do all the aligned
> blocks and the latter to do the last two blocks if they aren't also
> aligned. It may be necessary to do this here too for performance reasons -
> but there are considerations both ways:
>
> (1) firstly, there is an optimised assembly version of cts(cbc(aes)) on
> x86_64 that should be used instead of having two ciphers;
>
> (2) secondly, none of the hardware offload drivers seem to offer CTS
> support (Intel QAT does not, for instance).
>
> However, I don't know if it's possible to query the crypto API to find out
> whether there's an optimised CTS algorithm available.

Linux's "cts" is specifically the CS3 variant of CTS (using the terminology of
NIST SP800-38A https://dl.acm.org/doi/pdf/10.5555/2206248) which unconditionally
swaps the last two blocks. Is that the variant that is needed here? SP800-38A
mentions that CS3 is the variant used in Kerberos 5, so I assume yes. If yes,
then you need to use cts(cbc(aes)) unconditionally. (BTW, I hope you have some
test that shows that you actually implemented the Kerberos protocol correctly?)
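
To make the "unconditionally swaps the last two blocks" point concrete, here is a
minimal userspace sketch of the CBC-CS3 output ordering. It is not kernel code:
toy_encrypt_block() is a hypothetical, insecure stand-in for AES, MAX_LEN is an
arbitrary limit, and the sketch assumes the input is longer than one block.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BS 16          /* block size in bytes, matching AES */
#define MAX_LEN 256    /* arbitrary limit so the sketch needs no allocation */

/* Toy stand-in for a block cipher. NOT AES and NOT secure. */
static void toy_encrypt_block(uint8_t key, const uint8_t *in, uint8_t *out)
{
	for (int i = 0; i < BS; i++)
		out[i] = (uint8_t)((in[i] ^ key) + i);
}

/* Plain CBC over nblocks whole blocks; dst must not alias src. */
static void toy_cbc_encrypt(uint8_t key, const uint8_t *iv,
			    const uint8_t *src, uint8_t *dst, size_t nblocks)
{
	const uint8_t *prev = iv;

	for (size_t b = 0; b < nblocks; b++) {
		uint8_t x[BS];

		for (int i = 0; i < BS; i++)
			x[i] = src[b * BS + i] ^ prev[i];
		toy_encrypt_block(key, x, dst + b * BS);
		prev = dst + b * BS;
	}
}

/*
 * CBC-CS3 ordering: zero-pad the final block, run ordinary CBC, then emit
 * C1..C(n-2), Cn, and the first d bytes of C(n-1). The swap happens even
 * when the input is block-aligned (d == BS). Writes exactly len bytes.
 * Returns 0 on success, -1 if len is outside what this sketch handles.
 */
static int toy_cbc_cs3_encrypt(uint8_t key, const uint8_t *iv,
			       const uint8_t *src, uint8_t *dst, size_t len)
{
	uint8_t padded[MAX_LEN + BS] = { 0 };
	uint8_t ct[MAX_LEN + BS];
	size_t n, d;

	if (len <= BS || len > MAX_LEN)
		return -1;

	n = (len + BS - 1) / BS;        /* number of blocks after padding */
	d = len - (n - 1) * BS;         /* bytes in the final block, 1..BS */

	memcpy(padded, src, len);
	toy_cbc_encrypt(key, iv, padded, ct, n);

	memcpy(dst, ct, (n - 2) * BS);                      /* C1..C(n-2) */
	memcpy(dst + (n - 2) * BS, ct + (n - 1) * BS, BS);  /* Cn, in full */
	memcpy(dst + (n - 1) * BS, ct + (n - 2) * BS, d);   /* C(n-1), truncated */
	return 0;
}
```

With a 32-byte input the result is just the plain CBC ciphertext with its two
blocks exchanged, which is why a "non-CTS cipher for the aligned part" scheme
only matches CS3 if the last two blocks always go through the CTS path.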

x86_64 already has an AES-NI assembly-optimized cts(cbc(aes)), as you mentioned.
I will probably add a VAES-optimized cts(cbc(aes)) at some point; I've just been
doing other modes first. I don't see why off-CPU hardware offload support
should deserve much attention here, given the extremely high speed of on-CPU
crypto these days and the great difficulty of integrating off-CPU acceleration
efficiently. In particular it seems weird to consider Intel QAT a reasonable
thing to use over VAES. Regardless, absent direct support for cts(cbc(aes)) the
cts template will build it on top of cbc(aes) anyway.
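
On the question of finding out which cts(cbc(aes)) implementation is registered,
one userspace option is to look at /proc/crypto, which lists each registered
algorithm with its name, driver and priority. The sketch below parses text in
that "key : value" record format and picks the highest-priority driver for a
given name. struct alg_entry, field_value(), commit_record() and best_impl()
are hypothetical helper names for illustration, and only the three fields named
above are assumed.

```c
#include <stdlib.h>
#include <string.h>

struct alg_entry {
	char driver[128];
	int priority;
	int found;
};

/* If the line is "<key>   : <value>", copy the value to out and return 1. */
static int field_value(const char *line, size_t len, const char *key,
		       char *out, size_t outsz)
{
	size_t klen = strlen(key), vlen;
	const char *end = line + len;
	const char *colon = memchr(line, ':', len);

	if (!colon || len < klen || strncmp(line, key, klen) != 0)
		return 0;
	for (const char *p = line + klen; p < colon; p++)
		if (*p != ' ' && *p != '\t')
			return 0;       /* key is only a prefix of another field */
	colon++;
	while (colon < end && *colon == ' ')
		colon++;
	vlen = (size_t)(end - colon);
	if (vlen >= outsz)
		vlen = outsz - 1;
	memcpy(out, colon, vlen);
	out[vlen] = '\0';
	return 1;
}

/* Close the current record, keeping it if it matches and has higher priority. */
static void commit_record(struct alg_entry *best, struct alg_entry *cur,
			  int *name_matches)
{
	if (*name_matches && (!best->found || cur->priority > best->priority)) {
		*best = *cur;
		best->found = 1;
	}
	memset(cur, 0, sizeof(*cur));
	*name_matches = 0;
}

/* Scan /proc/crypto-style text for the best implementation of alg_name. */
static struct alg_entry best_impl(const char *text, const char *alg_name)
{
	struct alg_entry best = { "", 0, 0 }, cur = { "", 0, 0 };
	int name_matches = 0;
	char val[128];

	while (*text) {
		const char *eol = strchr(text, '\n');
		size_t len = eol ? (size_t)(eol - text) : strlen(text);

		if (len == 0) {
			/* A blank line separates records. */
			commit_record(&best, &cur, &name_matches);
		} else if (field_value(text, len, "name", val, sizeof(val))) {
			name_matches = strcmp(val, alg_name) == 0;
		} else if (field_value(text, len, "priority", val, sizeof(val))) {
			cur.priority = atoi(val);
		} else {
			field_value(text, len, "driver", cur.driver,
				    sizeof(cur.driver));
		}
		text += len + (eol ? 1 : 0);
	}
	commit_record(&best, &cur, &name_matches);
	return best;
}
```

To use it, read /proc/crypto into a NUL-terminated buffer and call
best_impl(buf, "cts(cbc(aes))"). Note that template instances only show up
there once something has instantiated them, so an empty result does not mean
the algorithm is unavailable. As far as I know, a driver name of the form
cts(...) indicates the generic template wrapped around some cbc(aes), while a
dedicated implementation registers a driver name of its own. In-kernel code can
get the same driver string from crypto_skcipher_driver_name() after allocating
the transform.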

- Eric