Re: [PATCH v2 09/19] crypto: x86 - use common macro for FPU limit

From: Jason A. Donenfeld
Date: Wed Oct 12 2022 - 21:13:22 EST


On Wed, Oct 12, 2022 at 04:59:21PM -0500, Robert Elliott wrote:
> Use a common macro name (FPU_BYTES) for the limit of the number of bytes
> processed within kernel_fpu_begin and kernel_fpu_end rather than
> using SZ_4K (which is a signed value), or a magic value of 4096U.

Not sure I like this very much. The whole idea is that this is variable
per algorithm, since not all algorithms have the same performance
characteristics. So in that sense, it's better to put this close to
where it's actually used, rather than somewhere at the top of the file.
When you do that, it makes it seem like "FPU_BYTES" is some universal
constant, which of course it isn't.

Instead, declare this as an untyped enum value within the function. For
example:

diff --git a/arch/x86/crypto/blake2s-glue.c b/arch/x86/crypto/blake2s-glue.c
index aaba21230528..602883eee5f3 100644
--- a/arch/x86/crypto/blake2s-glue.c
+++ b/arch/x86/crypto/blake2s-glue.c
@@ -30,7 +30,8 @@ void blake2s_compress(struct blake2s_state *state, const u8 *block,
size_t nblocks, const u32 inc)
{
/* SIMD disables preemption, so relax after processing each page. */
- BUILD_BUG_ON(SZ_4K / BLAKE2S_BLOCK_SIZE < 8);
+ enum { BLOCKS_PER_FPU = SZ_4K / BLAKE2S_BLOCK_SIZE };
+ BUILD_BUG_ON(BLOCKS_PER_FPU < 8);

if (!static_branch_likely(&blake2s_use_ssse3) || !may_use_simd()) {
blake2s_compress_generic(state, block, nblocks, inc);
@@ -38,8 +39,7 @@ void blake2s_compress(struct blake2s_state *state, const u8 *block,
}

do {
- const size_t blocks = min_t(size_t, nblocks,
- SZ_4K / BLAKE2S_BLOCK_SIZE);
+ const size_t blocks = min_t(size_t, nblocks, BLOCKS_PER_FPU);

kernel_fpu_begin();
if (IS_ENABLED(CONFIG_AS_AVX512) &&
diff --git a/arch/x86/crypto/chacha_glue.c b/arch/x86/crypto/chacha_glue.c
index 7b3a1cf0984b..f8fd2b7025c1 100644
--- a/arch/x86/crypto/chacha_glue.c
+++ b/arch/x86/crypto/chacha_glue.c
@@ -142,12 +142,14 @@ EXPORT_SYMBOL(chacha_init_arch);
void chacha_crypt_arch(u32 *state, u8 *dst, const u8 *src, unsigned int bytes,
int nrounds)
{
+ enum { BYTES_PER_FPU = SZ_4K };
+
if (!static_branch_likely(&chacha_use_simd) || !crypto_simd_usable() ||
bytes <= CHACHA_BLOCK_SIZE)
return chacha_crypt_generic(state, dst, src, bytes, nrounds);

do {
- unsigned int todo = min_t(unsigned int, bytes, SZ_4K);
+ unsigned int todo = min_t(unsigned int, bytes, BYTES_PER_FPU);

kernel_fpu_begin();
chacha_dosimd(state, dst, src, todo, nrounds);