Re: [PATCH] crypto: gcm - fix cacheline sharing

From: Eric Biggers
Date: Wed May 29 2019 - 16:31:01 EST


On Wed, May 29, 2019 at 08:10:56PM +0300, Iuliana Prodan wrote:
> The generic GCM driver should ensure that whatever it passes into
> scatterlists is safe for non-cache coherent DMA.
> The issue was seen while running GCM with the CAAM driver. However,
> since CAAM does not support GHASH on i.MX6, only the CTR skcipher
> part of GCM is offloaded.
> The skcipher request received by CAAM has req->src pointing to
> auth_tag[16] and req->iv pointing to iv[16]. The problem is that
> when the IV is updated (the crypto API requires skcipher
> implementations to update the IV with the last ciphertext block),
> it is written into iv[16], which shares a cacheline with the
> previously DMA-mapped auth_tag[16].
> The solution is to replace the auth_tag buffer with a separately
> allocated, cacheline-aligned buffer for encryption/decryption, and
> to free it on completion.
>
> Link: https://lore.kernel.org/linux-crypto/20190208114459.5nixe76xmmkhur75@xxxxxxxxxxxxxxxxxxx/
> Cc: <stable@xxxxxxxxxxxxxxx> # v4.19+
> Fixes: adcbc688fe2f ("crypto: gcm - Convert to new AEAD interface")
> Suggested-by: Ard Biesheuvel <ard.biesheuvel@xxxxxxxxxx>
> Signed-off-by: Iuliana Prodan <iuliana.prodan@xxxxxxx>
>
> ---
> I've checked the reproducibility of this issue starting with 4.19.y.
> ---
> crypto/gcm.c | 26 +++++++++++++++++---------
> include/crypto/gcm.h | 1 +
> 2 files changed, 18 insertions(+), 9 deletions(-)
>
> diff --git a/crypto/gcm.c b/crypto/gcm.c
> index 33f45a9..53e3ce5 100644
> --- a/crypto/gcm.c
> +++ b/crypto/gcm.c
> @@ -66,7 +66,7 @@ struct crypto_gcm_ghash_ctx {
>
> struct crypto_gcm_req_priv_ctx {
> u8 iv[16];
> - u8 auth_tag[16];
> + u8 *auth_tag;
> u8 iauth_tag[16];
> struct scatterlist src[3];
> struct scatterlist dst[3];
> @@ -177,19 +177,23 @@ static void crypto_gcm_init_common(struct aead_request *req)
> __be32 counter = cpu_to_be32(1);
> struct scatterlist *sg;
>
> - memset(pctx->auth_tag, 0, sizeof(pctx->auth_tag));
> + /*
> + * kzalloc alignment is at least the cache line size
> + * for non-cache coherent architectures.
> + */
> + pctx->auth_tag = kzalloc(GCM_MAX_AUTH_SIZE, GFP_KERNEL);
> memcpy(pctx->iv, req->iv, GCM_AES_IV_SIZE);
> memcpy(pctx->iv + GCM_AES_IV_SIZE, &counter, 4);
>
> sg_init_table(pctx->src, 3);
> - sg_set_buf(pctx->src, pctx->auth_tag, sizeof(pctx->auth_tag));
> + sg_set_buf(pctx->src, pctx->auth_tag, GCM_MAX_AUTH_SIZE);
> sg = scatterwalk_ffwd(pctx->src + 1, req->src, req->assoclen);
> if (sg != pctx->src + 1)
> sg_chain(pctx->src, 2, sg);
>
> if (req->src != req->dst) {
> sg_init_table(pctx->dst, 3);
> - sg_set_buf(pctx->dst, pctx->auth_tag, sizeof(pctx->auth_tag));
> + sg_set_buf(pctx->dst, pctx->auth_tag, GCM_MAX_AUTH_SIZE);
> sg = scatterwalk_ffwd(pctx->dst + 1, req->dst, req->assoclen);
> if (sg != pctx->dst + 1)
> sg_chain(pctx->dst, 2, sg);
> @@ -208,9 +212,8 @@ static void crypto_gcm_init_crypt(struct aead_request *req,
> dst = req->src == req->dst ? pctx->src : pctx->dst;
>
> skcipher_request_set_tfm(skreq, ctx->ctr);
> - skcipher_request_set_crypt(skreq, pctx->src, dst,
> - cryptlen + sizeof(pctx->auth_tag),
> - pctx->iv);
> + skcipher_request_set_crypt(skreq, pctx->src, dst, cryptlen +
> + GCM_MAX_AUTH_SIZE, pctx->iv);
> }
>
> static inline unsigned int gcm_remain(unsigned int len)
> @@ -440,6 +443,7 @@ static int gcm_enc_copy_hash(struct aead_request *req, u32 flags)
> scatterwalk_map_and_copy(auth_tag, req->dst,
> req->assoclen + req->cryptlen,
> crypto_aead_authsize(aead), 1);
> + kfree(auth_tag);
> return 0;
> }
>
> @@ -492,11 +496,15 @@ static int crypto_gcm_verify(struct aead_request *req)
> u8 *iauth_tag = pctx->iauth_tag;
> unsigned int authsize = crypto_aead_authsize(aead);
> unsigned int cryptlen = req->cryptlen - authsize;
> + int err;
>
> crypto_xor(auth_tag, iauth_tag, 16);
> scatterwalk_map_and_copy(iauth_tag, req->src,
> req->assoclen + cryptlen, authsize, 0);
> - return crypto_memneq(iauth_tag, auth_tag, authsize) ? -EBADMSG : 0;
> + err = crypto_memneq(iauth_tag, auth_tag, authsize) ? -EBADMSG : 0;
> + kfree(auth_tag);
> +
> + return err;
> }
>

So what about the other places that also pass an IV located next to the data,
like crypto/ccm.c and crypto/adiantum.c? If we're actually going to make this a
new API requirement, then we need to add a debugging option that makes the API
detect this violation so that the other places can be fixed too.
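
Roughly, such a check could walk the request's scatterlists and warn
whenever the IV lands on a cacheline that a data buffer also touches.
An untested sketch, just to illustrate the idea (helper name made up;
a real version would also need to cover req->dst and handle highmem
pages):

static void skcipher_check_iv_aliasing(struct skcipher_request *req)
{
        unsigned int ivsize =
                crypto_skcipher_ivsize(crypto_skcipher_reqtfm(req));
        unsigned long iv_first = (unsigned long)req->iv / L1_CACHE_BYTES;
        unsigned long iv_last =
                ((unsigned long)req->iv + ivsize - 1) / L1_CACHE_BYTES;
        struct scatterlist *sg;

        /* Warn if any src buffer touches a cacheline the IV lives on. */
        for (sg = req->src; sg; sg = sg_next(sg)) {
                unsigned long first =
                        (unsigned long)sg_virt(sg) / L1_CACHE_BYTES;
                unsigned long last =
                        ((unsigned long)sg_virt(sg) + sg->length - 1) /
                        L1_CACHE_BYTES;

                WARN_ONCE(first <= iv_last && iv_first <= last,
                          "IV shares a cacheline with request data\n");
        }
}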

Also, doing a kmalloc() per request is inefficient and very error-prone. In
fact there are at least 3 bugs here: (1) not checking the return value, (2)
incorrectly using GFP_KERNEL when the caller may be in atomic context, and
(3) not always freeing the memory. Why not use cacheline-aligned memory
within the request context, so that a separate kmalloc() isn't needed?
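
Something along these lines, for instance (just a sketch; it assumes
the request context itself can be made cacheline-aligned, which would
require padding the reqsize; members other than the ones visible in
the patch are elided):

struct crypto_gcm_req_priv_ctx {
        u8 iv[16];
        u8 iauth_tag[16];
        struct scatterlist src[3];
        struct scatterlist dst[3];
        /* ... */
        /*
         * Last and cacheline-aligned, so CPU writes to iv[] can never
         * dirty a cacheline that is DMA-mapped for the auth tag.
         */
        u8 auth_tag[16] ____cacheline_aligned;
};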

Also, did you consider whether there's any way to make the crypto API handle
this automatically, so that all the individual users don't have to?
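
One purely hypothetical shape for that: have the API bounce the IV
through a kmalloc()'d copy around the driver call, since kmalloc
alignment is at least the cacheline size on non-coherent
architectures. A sketch, assuming a synchronous tfm (allocated with
CRYPTO_ALG_ASYNC masked off); the function name is made up:

static int skcipher_encrypt_bounced_iv(struct skcipher_request *req)
{
        struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
        unsigned int ivsize = crypto_skcipher_ivsize(tfm);
        u8 *orig_iv = req->iv;
        u8 *bounce_iv;
        int err;

        /* kmalloc alignment >= cacheline size on non-coherent archs */
        bounce_iv = kmemdup(orig_iv, ivsize, GFP_ATOMIC);
        if (!bounce_iv)
                return -ENOMEM;

        req->iv = bounce_iv;
        err = crypto_skcipher_encrypt(req);

        /* Synchronous tfm assumed: the request has completed here. */
        memcpy(orig_iv, bounce_iv, ivsize);
        req->iv = orig_iv;
        kfree(bounce_iv);
        return err;
}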

- Eric