RE: framebuffer corruption due to overlapping stp instructions on arm64

From: David Laight
Date: Fri Aug 03 2018 - 09:02:48 EST


From: Mikulas Patocka
> Sent: 03 August 2018 13:05
...
> > Even on x86 using memcpy() on PCIe memory (maybe mmap()ed into userspace)
> > isn't a good idea.
> > In the kernel memcpy_to/fromio() ought to be a better choice but that
> > is just an alternate name for memcpy().
> >
> > The problem on x86 is that memcpy() is likely to be implemented as
> > 'rep movsb' on modern CPUs - relying on the cpu hardware to perform
> > cache-line sized transfers (etc).
> > Unfortunately on uncached locations it has to revert to byte copies.
> > So PCIe transfers (especially reads) are very slow.
> >
> > The transfers need to use the largest size register available.
> >
> > David
>
> On x86, the framebuffer is mapped as write-combining memory type, so "rep
> movsb" could merge the byte writes into larger chunks. I don't have a cpu
> with the ERMS feature - could anyone test whether rep movsb works worse or
> better than explicit writes to the framebuffer?

I don't think 'write combining' can help reads, and memcpy_to/fromio()
are likely to be used for normal memory-mapped I/O areas.
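
As a rough illustration of "use the largest size register available" for
reads, here is a minimal sketch that can be tested in userspace.
copy_from_io_wide() is a hypothetical name, not a kernel API. It assumes a
64-bit target where the compiler emits each aligned 8-byte volatile load as
a single access, which the C standard does not guarantee. Real kernel code
would more likely go through readq()/memcpy_fromio() or an
architecture-specific routine, and using wider vector registers would need
arch-specific code.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not a kernel API): copy n bytes out of an
 * uncached or I/O mapping using aligned 64-bit loads where possible,
 * so most reads move 8 bytes instead of 1.
 */
static void copy_from_io_wide(void *dst, const volatile void *src, size_t n)
{
	unsigned char *d = dst;
	const volatile unsigned char *s = src;

	/* Byte reads until the source address is 8-byte aligned. */
	while (n && ((uintptr_t)s & 7)) {
		*d++ = *s++;
		n--;
	}

	/* Aligned 64-bit reads; volatile asks for each access as written. */
	while (n >= 8) {
		uint64_t v = *(const volatile uint64_t *)s;

		memcpy(d, &v, sizeof(v));	/* destination may be unaligned */
		d += 8;
		s += 8;
		n -= 8;
	}

	/* Remaining tail bytes. */
	while (n) {
		*d++ = *s++;
		n--;
	}
}
```

Only the source side is aligned here because the slow path being discussed
is the read from the device; the destination is ordinary memory, so an
unaligned store there is cheap by comparison.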

David
