Re: [PATCH] crypto: fix ctx pointer and digest copy in sha256-mb

From: Tim Chen
Date: Fri Aug 12 2016 - 12:17:20 EST


On Fri, 2016-08-12 at 06:24 -0400, Xiaodong Liu wrote:
> 1. fix ctx pointer
> Use req_ctx, which is the ctx of the next job that has
> completed in the lanes, instead of rctx, the ctx of the
> first completed job, whose completion could already have
> been called and whose ctx may have been released.
> 2. fix digest copy
> Use an XMM register to copy the other 16 bytes of the sha256
> digest instead of a regular register.

The changes look fine to me. Megha, can you double check?

Tim

>
> Signed-off-by: Xiaodong Liu <xiaodong.liu@xxxxxxxxx>
> ---
>  arch/x86/crypto/sha256-mb/sha256_mb.c                | 4 ++--
>  arch/x86/crypto/sha256-mb/sha256_mb_mgr_flush_avx2.S | 7 ++++---
>  2 files changed, 6 insertions(+), 5 deletions(-)
>
> diff --git a/arch/x86/crypto/sha256-mb/sha256_mb.c b/arch/x86/crypto/sha256-mb/sha256_mb.c
> index 89fa85e..6f97fb3 100644
> --- a/arch/x86/crypto/sha256-mb/sha256_mb.c
> +++ b/arch/x86/crypto/sha256-mb/sha256_mb.c
> @@ -485,10 +485,10 @@ static int sha_complete_job(struct mcryptd_hash_request_ctx *rctx,
>
>   req = cast_mcryptd_ctx_to_req(req_ctx);
>   if (irqs_disabled())
> - rctx->complete(&req->base, ret);
> + req_ctx->complete(&req->base, ret);
>   else {
>   local_bh_disable();
> - rctx->complete(&req->base, ret);
> + req_ctx->complete(&req->base, ret);
>   local_bh_enable();
>   }
>   }
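
For readers following the first fix, here is a minimal C sketch of the
pattern, not the kernel code itself. The struct job_ctx type and the
record_completion() and complete_jobs() helpers are hypothetical and
heavily simplified. The point is only that each finished job is completed
through its own context pointer, so a context that was already completed
(and possibly released) is never dereferenced again.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a per-request context. */
struct job_ctx {
	void (*complete)(struct job_ctx *self, int err);
	int completed;	/* how many times the callback ran */
	int err;	/* status passed to the callback */
};

/* Example callback: record that this context was completed. */
static void record_completion(struct job_ctx *self, int err)
{
	self->completed++;
	self->err = err;
}

/*
 * Complete every finished job through its own context. Calling the
 * callback via a pointer to an earlier job instead would touch a
 * context whose owner may already have released it.
 */
static void complete_jobs(struct job_ctx **done, size_t n, int err)
{
	for (size_t i = 0; i < n; i++) {
		struct job_ctx *req_ctx = done[i];

		assert(req_ctx && req_ctx->complete);
		req_ctx->complete(req_ctx, err);
	}
}
```
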
> diff --git a/arch/x86/crypto/sha256-mb/sha256_mb_mgr_flush_avx2.S b/arch/x86/crypto/sha256-mb/sha256_mb_mgr_flush_avx2.S
> index b691da9..a78a069 100644
> --- a/arch/x86/crypto/sha256-mb/sha256_mb_mgr_flush_avx2.S
> +++ b/arch/x86/crypto/sha256-mb/sha256_mb_mgr_flush_avx2.S
> @@ -265,13 +265,14 @@ ENTRY(sha256_mb_mgr_get_comp_job_avx2)
>   vpinsrd $1, _args_digest+1*32(state, idx, 4), %xmm0, %xmm0
>   vpinsrd $2, _args_digest+2*32(state, idx, 4), %xmm0, %xmm0
>   vpinsrd $3, _args_digest+3*32(state, idx, 4), %xmm0, %xmm0
> - movl _args_digest+4*32(state, idx, 4), tmp2_w
> + vmovd _args_digest(state , idx, 4) , %xmm0
>   vpinsrd $1, _args_digest+5*32(state, idx, 4), %xmm1, %xmm1
>   vpinsrd $2, _args_digest+6*32(state, idx, 4), %xmm1, %xmm1
>   vpinsrd $3, _args_digest+7*32(state, idx, 4), %xmm1, %xmm1
>
> - vmovdqu %xmm0, _result_digest(job_rax)
> - movl tmp2_w, _result_digest+1*16(job_rax)
> + vmovdqu %xmm0, _result_digest(job_rax)
> + offset = (_result_digest + 1*16)
> + vmovdqu %xmm1, offset(job_rax)
>
>   pop %rbx
>
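
As a reading aid for the second fix, here is a rough C sketch of the
digest copy. It assumes the lane-interleaved layout implied by the
addressing in the hunk above (_args_digest + i*32 with index idx scaled
by 4), i.e. eight lanes of 32-bit words, with word i of lane idx at byte
offset i*32 + idx*4. The struct lane_digests type and copy_lane_digest()
are hypothetical names, not kernel symbols. A SHA-256 digest is eight
32-bit words (32 bytes), so all eight words must reach the result, not
only the first four plus a single word.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_LANES    8	/* assumed: 32-byte row stride / 4-byte words */
#define DIGEST_WORDS 8	/* SHA-256 digest: 8 x 32-bit words = 32 bytes */

/* Hypothetical stand-in for the lane-interleaved digest area. */
struct lane_digests {
	uint32_t word[DIGEST_WORDS][NUM_LANES];
};

/* Gather all eight digest words of lane idx into a contiguous result. */
static void copy_lane_digest(const struct lane_digests *state, unsigned idx,
			     uint32_t result[DIGEST_WORDS])
{
	assert(idx < NUM_LANES);
	for (unsigned i = 0; i < DIGEST_WORDS; i++)
		result[i] = state->word[i][idx];
}
```

In the assembly, the two vmovdqu stores play the role of this loop: each
writes 16 bytes, covering words 0-3 and words 4-7 of the result.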