Re: [PATCH 2/4] Accelerated CRC T10 DIF computation with PCLMULQDQ instruction
From: Jussi Kivilinna
Date: Wed Apr 17 2013 - 13:58:43 EST
On 16.04.2013 19:20, Tim Chen wrote:
> This is the x86_64 CRC T10 DIF transform accelerated with the PCLMULQDQ
> instructions. Details discussing the implementation can be found in the
> paper:
>
> "Fast CRC Computation for Generic Polynomials Using PCLMULQDQ Instruction"
> URL: http://download.intel.com/design/intarch/papers/323102.pdf
URL does not work.
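Since the paper cannot be fetched from that link, a plain bitwise reference may be useful for cross-checking the accelerated routine. This is only a sketch and is not taken from the patch: the name crc_t10dif_ref is hypothetical, and it assumes the standard T10 DIF parameters (polynomial 0x8BB7, MSB first, no reflection, no final XOR).

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Bitwise CRC T10 DIF reference (hypothetical helper, not part of the patch).
 * Assumes polynomial 0x8BB7, processed MSB first, with no final XOR.
 * 'crc' is the running value; pass 0 for a fresh computation.
 */
static uint16_t crc_t10dif_ref(uint16_t crc, const unsigned char *buf, size_t len)
{
	for (size_t i = 0; i < len; i++) {
		/* Fold the next input byte into the top 8 bits of the CRC. */
		crc ^= (uint16_t)(buf[i] << 8);

		/* Shift out one bit at a time, reducing by the polynomial. */
		for (int bit = 0; bit < 8; bit++) {
			if (crc & 0x8000)
				crc = (uint16_t)((crc << 1) ^ 0x8BB7);
			else
				crc = (uint16_t)(crc << 1);
		}
	}
	return crc;
}
```

Calling crc_t10dif_ref(0, buf, len) over the same buffers as the PCLMULQDQ routine should give matching results for every length, including the short-buffer paths.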
>
> Signed-off-by: Tim Chen <tim.c.chen@xxxxxxxxxxxxxxx>
> Tested-by: Keith Busch <keith.busch@xxxxxxxxx>
> ---
> arch/x86/crypto/crct10dif-pcl-asm_64.S | 659 +++++++++++++++++++++++++++++++++
> 1 file changed, 659 insertions(+)
> create mode 100644 arch/x86/crypto/crct10dif-pcl-asm_64.S
<snip>
> +
> + # Allocate Stack Space
> + mov %rsp, %rcx
> + sub $16*10, %rsp
> + and $~(0x20 - 1), %rsp
> +
> + # push the xmm registers into the stack to maintain
> + movdqa %xmm10, 16*2(%rsp)
> + movdqa %xmm11, 16*3(%rsp)
> + movdqa %xmm8 , 16*4(%rsp)
> + movdqa %xmm12, 16*5(%rsp)
> + movdqa %xmm13, 16*6(%rsp)
> + movdqa %xmm6, 16*7(%rsp)
> + movdqa %xmm7, 16*8(%rsp)
> + movdqa %xmm9, 16*9(%rsp)
You don't need to store (and restore) these, as 'crc_t10dif_pcl' is called between kernel_fpu_begin() and kernel_fpu_end().
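To illustrate the calling pattern I mean, here is a userspace sketch. Everything in it is a stand-in: the two FPU helpers below are hypothetical stubs that only track nesting (the real ones save and restore the FPU/SSE state), the crc_t10dif_pcl stub does no CRC work, its signature is an assumption, and the wrapper name crc_t10dif_update_sketch is made up.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stubs: only track whether we are inside the section. */
static int fpu_depth;
static int called_inside_fpu;

static void kernel_fpu_begin(void) { fpu_depth++; }
static void kernel_fpu_end(void)   { fpu_depth--; }

/*
 * Stand-in for the assembly routine (signature assumed). It records whether
 * it ran inside the begin/end section and returns the CRC unchanged.
 */
static uint16_t crc_t10dif_pcl(uint16_t crc, const unsigned char *buf, size_t len)
{
	(void)buf;
	(void)len;
	called_inside_fpu = (fpu_depth > 0);
	return crc;
}

/*
 * Caller-side bracket: the XMM state is handled around the call, so the
 * routine itself does not have to spill xmm6-xmm13 to the stack.
 */
static uint16_t crc_t10dif_update_sketch(uint16_t crc, const unsigned char *buf,
					 size_t len)
{
	kernel_fpu_begin();
	crc = crc_t10dif_pcl(crc, buf, len);
	kernel_fpu_end();
	return crc;
}
```

With the bracket in the caller, the movdqa saves above and the matching restores in _cleanup can simply be dropped.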
> +
> +
> + # check if smaller than 256
> + cmp $256, arg3
> +
<snip>
> +_cleanup:
> + # scale the result back to 16 bits
> + shr $16, %eax
> + movdqa 16*2(%rsp), %xmm10
> + movdqa 16*3(%rsp), %xmm11
> + movdqa 16*4(%rsp), %xmm8
> + movdqa 16*5(%rsp), %xmm12
> + movdqa 16*6(%rsp), %xmm13
> + movdqa 16*7(%rsp), %xmm6
> + movdqa 16*8(%rsp), %xmm7
> + movdqa 16*9(%rsp), %xmm9
These restores are unnecessary as well: the registers are overwritten by kernel_fpu_end() anyway.
> + mov %rcx, %rsp
> + ret
> +ENDPROC(crc_t10dif_pcl)
> +
You should move ENDPROC to the end of the full function.
> +########################################################################
> +
> +.align 16
> +_less_than_128:
> +
> + # check if there is enough buffer to be able to fold 16B at a time
> + cmp $32, arg3
<snip>
> + movdqa (%rsp), %xmm7
> + pshufb %xmm11, %xmm7
> + pxor %xmm0 , %xmm7 # xor the initial crc value
> +
> + psrldq $7, %xmm7
> +
> + jmp _barrett
Move ENDPROC here.
-Jussi