Re: [PATCH] mtd: phram: Map RAM using memremap instead of ioremap

From: Petr Malat
Date: Tue Jun 07 2022 - 06:25:22 EST


Hi!

On Mon, May 23, 2022 at 04:09:20PM +0000, David Laight wrote:
> On x86 (which I know a lot more about) memcpy() has a nasty
> habit of getting implemented as 'rep movsb' relying on the
> cpu to speed it up.
> But that doesn't happen for uncached addresses - so you get
> very slow byte copies.

I have measured the performance with my change (patched) and
without it (orig). The change improves performance on x86_64 and
ARM; on MIPS64 it stays the same:

Tests
=====
All runtimes are in milliseconds, the average real time of 3 runs,
measured with the bash time built-in. The measured process ran under
SCHED_FIFO with priority 99. The page cache was flushed before every
run, but all involved program images were in tmpfs (no swap).
- dd r512
    dd if=/dev/TESTDEV of=/dev/null bs=512
- dd r1MB
    dd if=/dev/TESTDEV of=/dev/null bs=1M
- dd w512
    dd of=/dev/TESTDEV if=/tmpfs/img bs=512
- dd w1MB
    dd of=/dev/TESTDEV if=/tmpfs/img bs=1M
- flashcp
    flashcp /tmpfs/img /dev/TESTDEV
- flasherase
    flash_eraseall -q /dev/TESTDEV
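
For anyone who wants to repeat the measurement, the procedure described
above could be scripted roughly as follows. This is only a sketch, not
the script used for the numbers in this mail: the helper names to_ms and
bench are hypothetical, and using chrt for SCHED_FIFO and
/proc/sys/vm/drop_caches for the page cache flush are assumptions. It
has to run as root under bash.

```shell
#!/bin/bash
# Sketch only: to_ms and bench are hypothetical helpers.

# Convert a "seconds.milliseconds" string as printed by bash's time
# keyword with TIMEFORMAT=%3R (e.g. "1.234") into integer milliseconds.
to_ms() {
    local ms="${1%[.,]*}${1#*[.,]}"     # "0.070" -> "0070"
    # Strip leading zeros so the value is not parsed as octal later.
    while [ "${#ms}" -gt 1 ] && [ "${ms#0}" != "$ms" ]; do
        ms=${ms#0}
    done
    echo "$ms"
}

# Run the given command 3 times under SCHED_FIFO priority 99, flushing
# the page cache before every run, and print the average real time in ms.
bench() {
    local TIMEFORMAT=%3R sum=0 t i
    for i in 1 2 3; do
        sync
        echo 3 > /proc/sys/vm/drop_caches
        # The time keyword reports on the shell's stderr; capture only that.
        t=$( { time chrt -f 99 "$@" >/dev/null 2>&1; } 2>&1 )
        sum=$(( sum + $(to_ms "$t") ))
    done
    echo $(( sum / 3 ))
}

# Example (TESTDEV is the placeholder used in the list above):
#   bench dd if=/dev/TESTDEV of=/dev/null bs=512
```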

Results
=======
All times are in ms

ARCH       |     MIPS64      |       ARM       |     X86_64
CPU        |   CN6335p2.2    |    v7 TI K2     |   Xeon D-1548
Dev. size  |      32MB       |      128MB      |      256MB
-----------+-------+---------+-------+---------+-------+---------
in ms      | Orig  | Patched | Orig  | Patched | Orig  | Patched
dd r512    |   131 |     130 |  1101 |     543 | 22906 |     281
dd r1MB    |    65 |      65 |   655 |     122 | 22715 |      70
dd w512    |  1150 |    1150 |  1136 |    1042 | 28067 |     412
dd w1MB    |   104 |     104 |   396 |     244 | 27761 |     122
flashcp    |   100 |      99 |  1438 |     568 | 78455 |     270
flasherase |    21 |      21 |   208 |      77 | 27707 |      57
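
To read the table in relative terms, the speedup of a test is its Orig
time divided by its Patched time. A minimal sketch of that arithmetic
(the speedup helper is hypothetical and only restates numbers from the
table):

```shell
# speedup ORIG PATCHED: print ORIG/PATCHED rounded to one decimal place.
speedup() {
    awk -v o="$1" -v p="$2" 'BEGIN { printf "%.1f\n", o / p }'
}

speedup 22906 281   # dd r512 on the Xeon D-1548, prints 81.5
speedup 1101 543    # dd r512 on ARM
speedup 131 130     # dd r512 on MIPS64, effectively unchanged
```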

BR,
Petr