RE: framebuffer corruption due to overlapping stp instructions on arm64

From: David Laight
Date: Fri Aug 03 2018 - 07:22:56 EST


From: Ard Biesheuvel
> Sent: 03 August 2018 10:30
...
> The discussion about whether memcpy() should rely on unaligned
> accesses, and whether you should use it on device memory is orthogonal
> to that, and not the heart of the matter IMO

Even on x86, using memcpy() on PCIe memory (possibly mmap()ed into
userspace) isn't a good idea.
In the kernel, memcpy_toio()/memcpy_fromio() ought to be a better choice,
but those are just alternate names for memcpy().

The problem on x86 is that memcpy() is likely to be implemented as
'rep movsb' on modern CPUs, relying on the CPU hardware to perform
cache-line sized transfers (etc).
Unfortunately, on uncached locations it has to fall back to byte copies,
so PCIe transfers (especially reads) are very slow.

The transfers need to use the largest register size available.
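
As a rough illustration of what that could look like, here is a minimal
sketch of a copy loop that reads an uncached source with naturally aligned
64-bit loads and only uses byte accesses for the unaligned head and tail.
The name copy_from_uncached() is made up for this example and is not a
kernel API; it is plain C so it can be tried in userspace. In the kernel
the loads would go through the I/O accessors (e.g. readq()) on an __iomem
pointer rather than a volatile dereference, and wider vector registers
could be used where the platform allows it.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, for illustration only: copy 'len' bytes from a
 * (possibly uncached) source using 64-bit loads wherever the source
 * address is 8-byte aligned.
 */
static void copy_from_uncached(void *dst, const volatile void *src, size_t len)
{
	unsigned char *d = dst;
	const volatile unsigned char *s = src;

	/* Head: byte loads until the source pointer is 8-byte aligned. */
	while (len && ((uintptr_t)s & 7)) {
		*d++ = *s++;
		len--;
	}

	/*
	 * Body: one aligned 64-bit load per iteration. The volatile
	 * qualifier keeps the compiler from splitting or merging the
	 * accesses to the source.
	 */
	while (len >= 8) {
		uint64_t v = *(const volatile uint64_t *)s;

		/* The destination is ordinary memory and may be unaligned. */
		memcpy(d, &v, sizeof(v));
		d += 8;
		s += 8;
		len -= 8;
	}

	/* Tail: remaining bytes, one at a time. */
	while (len) {
		*d++ = *s++;
		len--;
	}
}
```

Aligning on the source rather than the destination is a deliberate choice
here: it is the reads from the device that are expensive, so those are the
accesses kept wide and aligned.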

David
