Re: [PATCH RFC] [X86] performance improvement for memcpy_64.S by avoid memory miss predication.

From: Ingo Molnar
Date: Mon Oct 19 2009 - 03:17:55 EST



* linguranus@xxxxxxxxx <linguranus@xxxxxxxxx> wrote:

> From: Ling <linguranus@xxxxxxxxx>
>
> Hi All
>
> The CPU uses memory disambiguation prediction to speculatively read
> memory without waiting for earlier write instructions, while still
> avoiding read-after-write (RAW) conflicts between them. However, it
> seems to compare only the last 12 bits of the address, not the full
> address. For example, if rsi is 0xf004 and rdi is 0xe008, the
> following operation incurs a large performance penalty.
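
To make the quoted example concrete, here is a minimal sketch of the
low-12-bit arithmetic. The helper name low12_delta() is hypothetical, and
the idea that only bits 11:0 matter is the claim made above, not something
this snippet verifies. It only shows how far apart the two pointers are
modulo 4 KiB.

```c
#include <stdint.h>

/*
 * Hypothetical helper: distance from src to dst modulo 4 KiB, i.e. the
 * difference seen when only the low 12 address bits are compared.
 * Assumption (from the quoted text): the CPU looks at bits 11:0 only.
 */
static inline unsigned int low12_delta(uintptr_t src, uintptr_t dst)
{
	return (unsigned int)((dst - src) & 0xfff);
}
```

For rsi = 0xf004 and rdi = 0xe008 this gives 0x004: the buffers are almost
a full page apart, yet their low 12 bits differ by only 4 bytes.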

Would be nice to trigger this kind of pattern via some testcase that
uses read() or write() - or some other real workload.

Then you can use 'perf stat --repeat 10 ./my-test-prog' to measure it
and post the results - the before-patch/after-patch instruction count,
cycle count, etc.

You can get 'perf' via:

cd tools/perf
make -j install

(And please preserve the Cc: line for new postings - thanks.)

Thanks,

Ingo