Re: Question Regarding ERMS memcpy

From: hpa
Date: Sat Mar 04 2017 - 18:56:00 EST


On March 4, 2017 3:46:44 PM PST, Logan Gunthorpe <logang@xxxxxxxxxxxx> wrote:
>Hi Borislav,
>
>Thanks for the help.
>
>On 04/03/17 03:43 PM, Borislav Petkov wrote:
>> You can boot with "debug-alternative" and look for those strings where
>
>Here's the symbols for memcpy and the corresponding apply_alternatives
>lines:
>
>ffffffff8122df90 T __memcpy
>ffffffff8122df90 W memcpy
>ffffffff8122dfb0 T memcpy_erms
>ffffffff8122dfc0 T memcpy_orig
>
>[ 0.076018] apply_alternatives: feat: 3*32+16, old: (ffffffff8122df90, len: 5), repl: (ffffffff819d7ac5, len: 0), pad: 0
>[ 0.076019] ffffffff8122df90: old_insn: e9 2b 00 00 00
>[ 0.076021] ffffffff8122df90: final_insn: 66 66 90 66 90
>
>So it looks like it's patching in a NOP as it is supposed to.
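The log lines above can be cross-checked by hand. The following is a minimal sketch, not kernel code: the helper names are hypothetical, and it assumes the feature number is encoded as word*32+bit and that old_insn is a 5-byte "jmp rel32" (opcode 0xe9) whose displacement is relative to the end of the instruction. Under those assumptions, feat 3*32+16 is word 3, bit 16 (X86_FEATURE_REP_GOOD in the kernel's cpufeatures.h, as far as I recall), and the original jump at ffffffff8122df90 lands on ffffffff8122dfc0, which matches the memcpy_orig symbol listed above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: split a feature number into its 32-bit word and bit. */
static inline void feature_split(unsigned feat, unsigned *word, unsigned *bit)
{
    *word = feat / 32;
    *bit = feat % 32;
}

/*
 * Hypothetical helper: compute the target of a 5-byte "jmp rel32" located
 * at addr. The displacement is little-endian, signed, and relative to the
 * address of the next instruction (addr + 5).
 */
static inline uint64_t jmp_rel32_target(uint64_t addr, const uint8_t insn[5])
{
    assert(insn[0] == 0xe9); /* only the near-jump opcode is handled here */

    uint32_t raw = (uint32_t)insn[1] |
                   ((uint32_t)insn[2] << 8) |
                   ((uint32_t)insn[3] << 16) |
                   ((uint32_t)insn[4] << 24);
    int32_t rel = (int32_t)raw;

    return addr + 5 + (uint64_t)(int64_t)rel;
}
```

Feeding it the bytes e9 2b 00 00 00 at ffffffff8122df90 gives ffffffff8122df95 + 0x2b = ffffffff8122dfc0, so on this machine the jump to memcpy_orig is what gets replaced with NOPs.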
>
>
>> Also, do
>>
>> make <path-to-file>.s
>>
>> of the file where it does memcpy_fromio() and attach it here. I'd like to see
>> what the compiler generates.
>
>I've attached the whole file but this is the code in question:
>
> .loc 5 219 0
> movq 936(%rbp), %rdx # stdev_3(D)->mmio_mrpc, tmp127
> leaq 48(%rbx), %rax #, tmp113
> movq 40(%rbx), %rcx # MEM[(struct switchtec_user *)__mptr_7 + -56B].read_len, MEM[(struct switchtec_user *)__mptr_7 + -56B].read_len
> movq %rax, %rdi # tmp113, tmp118
> leaq 1024(%rdx), %rsi #, tmp114
> rep movsb
>
>Strangely, it decided to inline a memcpy with rep movsb. Any idea why
>the compiler would do that? Is this just gcc being stupid? My gcc
>version is below.
>
>Thanks,
>
>Logan
>
>
>Using built-in specs.
>COLLECT_GCC=gcc
>COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/4.9/lto-wrapper
>Target: x86_64-linux-gnu
>Configured with: ../src/configure -v --with-pkgversion='Debian 4.9.2-10'
>--with-bugurl=file:///usr/share/doc/gcc-4.9/README.Bugs
>--enable-languages=c,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr
>--program-suffix=-4.9 --enable-shared --enable-linker-build-id
>--libexecdir=/usr/lib --without-included-gettext --enable-threads=posix
>--with-gxx-include-dir=/usr/include/c++/4.9 --libdir=/usr/lib
>--enable-nls --with-sysroot=/ --enable-clocale=gnu
>--enable-libstdcxx-debug --enable-libstdcxx-time=yes
>--enable-gnu-unique-object --disable-vtable-verify --enable-plugin
>--with-system-zlib --disable-browser-plugin --enable-java-awt=gtk
>--enable-gtk-cairo
>--with-java-home=/usr/lib/jvm/java-1.5.0-gcj-4.9-amd64/jre
>--enable-java-home
>--with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-4.9-amd64
>--with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-4.9-amd64
>--with-arch-directory=amd64
>--with-ecj-jar=/usr/share/java/eclipse-ecj.jar --enable-objc-gc
>--enable-multiarch --with-arch-32=i586 --with-abi=m64
>--with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic
>--enable-checking=release --build=x86_64-linux-gnu
>--host=x86_64-linux-gnu --target=x86_64-linux-gnu
>Thread model: posix
>gcc version 4.9.2 (Debian 4.9.2-10)

For newer processors, as determined by -mtune=, an inline rep movsb is actually the best option for an arbitrary copy.
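For reference, the sequence gcc emitted in the quoted assembly (destination in %rdi, source in %rsi, byte count in %rcx, then rep movsb) can be written out as a rough sketch like the one below. This is not the kernel's memcpy and not what gcc generates internally: copy_rep_movsb is a hypothetical name, the inline asm assumes GCC or Clang on x86-64 with the direction flag clear (as the ABI requires at function boundaries), and other targets simply fall back to memcpy().

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy n bytes with a single "rep movsb".
 * On x86-64 the instruction takes the destination in RDI, the source
 * in RSI and the count in RCX, and advances all three as it copies.
 */
static inline void *copy_rep_movsb(void *dst, const void *src, size_t n)
{
    void *ret = dst;

#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
    __asm__ volatile("rep movsb"
                     : "+D"(dst), "+S"(src), "+c"(n)
                     :
                     : "memory");
#else
    /* Assumption: on any other target, plain memcpy() stands in. */
    memcpy(dst, src, n);
#endif

    return ret;
}
```

Whether this beats an out-of-line call depends on the CPU the compiler is tuning for, which is the point about -mtune= above.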