Re: [PATCH v3 RESEND] x86: optimize memcpy_flushcache

From: Ingo Molnar
Date: Mon Sep 10 2018 - 09:18:07 EST



* Mikulas Patocka <mpatocka@xxxxxxxxxx> wrote:

> Here I resend it:
>
>
> From: Mikulas Patocka <mpatocka@xxxxxxxxxx>
> Subject: [PATCH] x86: optimize memcpy_flushcache
>
> I use memcpy_flushcache in my persistent memory driver for metadata
> updates. There are many 8-byte and 16-byte updates, and it turns out that
> the overhead of memcpy_flushcache causes a 2% performance degradation
> compared to the "movnti" instruction coded explicitly in inline assembler.
>
> The tests were done on a Skylake processor with persistent memory emulated
> using the "memmap" kernel parameter. dd was used to copy data to the
> dm-writecache target.
>
> This patch recognizes memcpy_flushcache calls with a short constant
> length and turns them into inline assembler, so that I don't have to use
> inline assembler in the driver.
>
> Signed-off-by: Mikulas Patocka <mpatocka@xxxxxxxxxx>
>
> ---
> arch/x86/include/asm/string_64.h | 20 +++++++++++++++++++-
> arch/x86/lib/usercopy_64.c | 4 ++--
> 2 files changed, 21 insertions(+), 3 deletions(-)
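
A minimal userspace sketch of the idea in the quoted description, assuming
GCC or Clang on x86-64. It is an illustration only, not the code from the
patch: sketch_memcpy_flushcache and generic_copy_flushcache are hypothetical
names, and the compiler intrinsics _mm_stream_si32/_mm_stream_si64 (which
emit movnti) stand in for the inline assembler. On other architectures, or
when the length is not a compile-time constant, it falls back to the generic
path.

```c
#include <stddef.h>
#include <string.h>

#if defined(__x86_64__)
#include <emmintrin.h>	/* _mm_stream_si32, _mm_stream_si64 (movnti) */
#endif

/* Hypothetical stand-in for the out-of-line, any-length routine. */
static void generic_copy_flushcache(void *dst, const void *src, size_t cnt)
{
	memcpy(dst, src, cnt);
}

/*
 * Hypothetical wrapper: when the compiler can prove that cnt is a
 * constant 4, 8 or 16, emit non-temporal stores directly; otherwise
 * call the generic routine. dst is assumed to be naturally aligned.
 * Ordering against later stores is left to the caller.
 */
static inline __attribute__((always_inline))
void sketch_memcpy_flushcache(void *dst, const void *src, size_t cnt)
{
#if defined(__x86_64__)
	if (__builtin_constant_p(cnt)) {
		switch (cnt) {
		case 4: {
			int w;

			memcpy(&w, src, sizeof(w));
			_mm_stream_si32((int *)dst, w);
			return;
		}
		case 8: {
			long long q;

			memcpy(&q, src, sizeof(q));
			_mm_stream_si64((long long *)dst, q);
			return;
		}
		case 16: {
			long long q[2];

			memcpy(q, src, sizeof(q));
			_mm_stream_si64((long long *)dst, q[0]);
			_mm_stream_si64((long long *)dst + 1, q[1]);
			return;
		}
		}
	}
#endif
	generic_copy_flushcache(dst, src, cnt);
}
```

A caller would write something like sketch_memcpy_flushcache(&slot, &value, 8)
with a literal length. Without optimization, __builtin_constant_p typically
evaluates to 0 and the call simply takes the generic path, so the result is
the same either way; only the generated code differs.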

Applied to tip:x86/asm, thanks!

I'll push it out later today after some testing.

Thanks,

Ingo