Re: [RFC PATCH] [X86/mem] Handle unaligned case by avoiding store crossing cache line

From: Denys Vlasenko
Date: Tue Oct 12 2010 - 08:58:55 EST


On Tue, Oct 12, 2010 at 10:48 PM, <ling.ma@xxxxxxxxx> wrote:
> From: Ma Ling <ling.ma@xxxxxxxxx>
>
> In this patch we manage to reduce the penalty from crossing a cache line
> on some CPU archs. There are two crossing-cache-line cases:
> read and write. Write is more expensive on some archs because of
> missing cache-way prediction and the read-for-ownership operation,
> so here we avoid unaligned stores. Shifting data through registers
> to realign would also add penalty in the decode stages, so we
> tolerate unaligned reads instead.
...
> Signed-off-by: Ma Ling <ling.ma@xxxxxxxxx>
> ---
>  arch/x86/lib/memcpy_64.S |   59 ++++++++++++++++++++++++++++++++++++++++-----
>  1 files changed, 52 insertions(+), 7 deletions(-)
>
> diff --git a/arch/x86/lib/memcpy_64.S b/arch/x86/lib/memcpy_64.S
> index 75ef61e..7545b08 100644
> --- a/arch/x86/lib/memcpy_64.S
> +++ b/arch/x86/lib/memcpy_64.S
> @@ -45,7 +45,7 @@ ENTRY(memcpy)
>        /*
>         * Use 32bit CMP here to avoid long NOP padding.
>         */
> -       cmp  $0x20, %edx
> +       cmp  $0x28, %rdx

Well, look above your change. The comment says "Use 32bit CMP".
If you really want to switch to a 64-bit one, then update the comment too.

> +       /*
> +        * We append data to avoid store crossing cache.
> +        */
> +       movq (%rsi), %rcx
> +       movq %rdi, %r8
> +       addq $8, %rdi
> +       andq $-8, %rdi
> +       movq %rcx, (%r8)
> +       subq %rdi, %r8
> +       addq %r8, %rdx
> +       subq %r8, %rsi

The comment doesn't really help to understand what you are doing here.
Maybe "Align store location to 32 bytes to avoid crossing cachelines"?
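For reference, the quoted asm sequence can be sketched in C roughly like
this (a hypothetical illustration only, not the patch itself; `dst`/`src`/`len`
stand in for %rdi/%rsi/%rdx, and the caller is assumed to have already
checked that len is large enough, as the asm does with its size check):

```c
#include <stdint.h>
#include <string.h>

/* Sketch of the head-alignment trick: store the first 8 bytes unaligned,
 * then round dst up to the next 8-byte boundary and adjust src/len so
 * all subsequent stores are aligned. Assumes len >= 8. */
static void *copy_align_dst(void *dst, const void *src, size_t len)
{
	unsigned char *d = dst;
	const unsigned char *s = src;
	void *ret = dst;
	uint64_t head;

	memcpy(&head, s, 8);	/* movq (%rsi), %rcx  */
	memcpy(d, &head, 8);	/* movq %rcx, (%r8)   */

	/* addq $8, %rdi; andq $-8, %rdi: advance dst to the next 8-byte
	 * boundary (by a full 8 if it was already aligned). The skipped
	 * bytes were just written by the head store above. */
	unsigned char *aligned =
		(unsigned char *)(((uintptr_t)d + 8) & ~(uintptr_t)7);
	size_t skipped = (size_t)(aligned - d);	/* 1..8 bytes */

	s += skipped;		/* subq %r8, %rsi (r8 is negative delta) */
	len -= skipped;		/* addq %r8, %rdx */
	d = aligned;

	memcpy(d, s, len);	/* the aligned main copy loop stands in here */
	return ret;
}
```

Note the trade-off the commit message describes: after this prologue the
loads from `s` may still be unaligned, but every store lands on an aligned
destination, which is the expensive side on the affected archs.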

>        /*
> -        * At most 3 ALU operations in one cycle,
> -        * so append NOPS in the same 16bytes trunk.
> +        * We append data to avoid store crossing cache.
>         */

Same here.

--
vda
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/