Re: framebuffer corruption due to overlapping stp instructions on arm64
From: Ard Biesheuvel
Date: Mon Aug 06 2018 - 04:10:26 EST
On 6 August 2018 at 10:02, Mikulas Patocka <mpatocka@xxxxxxxxxx> wrote:
>
>
> On Sun, 5 Aug 2018, Florian Weimer wrote:
>
>> On 08/04/2018 01:04 PM, Mikulas Patocka wrote:
>> > There's plenty of memcpy's in the graphics stack. No one will be rewriting
>> > all the graphics drivers because of tiny market share that ARM has in
>> > desktop computers. So if you refuse to fix things and blame everyone else,
>> > you can as well announce that you don't want to have PCIe graphics on ARM
>> > at all.
>>
>> The POWER toolchain maintainers said pretty much the same thing not too
>> long ago. I wonder how many architectures need to fail until the
>> graphics stack is finally fixed.
>>
>> Thanks,
>> Florian
>
> If you say that your architecture doesn't support unaligned accesses at
> all, there's no problem - the compiler won't generate them and the libc
> won't contain them.
>
> But if you say that your architecture supports unaligned accesses except
> for the framebuffer, then you have a problem - the compiler can't know
> which pointers point to the framebuffer and libc can't know either - you
> caused this problem by your architectural decision.
>
> You can use 'volatile' to suppress memory optimizations, but it's
> impossible to go through the whole Linux graphics stack and add volatile
> to every pointer that may point to videoram. Even if you succeeded, new
> videoram accesses without volatile will appear after a year of
> development.
>
> See for example the macros READ_ONCE and WRITE_ONCE in Linux kernel - they
> should be used when there's concurrent access to the particular variable,
> but mainstream architectures don't require them, so many kernel developers
> are omitting them in their code.
>
> If you are building a supercomputer with a particular GPU, you can force
> the GPU vendor to provide POWER-compliant drivers. If you are building a
> workstation where the user can plug any GPU, forcing developers will go
> nowhere. You have to emulate the unaligned accesses and make sure that the
> next versions of your architecture support them in hardware.
>
I have the feeling this discussion is going off the rails again.

The original report is about corruption when doing overlapping writes.
Matt Sealey said you cannot have PCI outbound windows with memory
semantics on ARM, and so you should be using device mappings (which do
not tolerate unaligned accesses).

In this context, 'device mapping' does not mean 'any non-DRAM region';
it refers to a particular type of MMU mapping attribute defined by
the ARM architecture.

I think we can all agree that memcpy() should be usable on any region
of memory that has true memory semantics, even if it is backed by VRAM
on a graphics card.

The question is whether PCIe can provide such regions on ARM.
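
For illustration only (this is not code from the thread): a minimal C
sketch of a copy routine for a destination that lacks true memory
semantics. It assumes the destination tolerates only naturally aligned
stores and that no byte should be written twice, whereas an optimized
memcpy() may cover the tail of a buffer with stores that overlap earlier
ones. The name copy_to_device_mem is hypothetical; inside the kernel,
memcpy_toio() is the interface meant for I/O memory.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, not an existing API: copy n bytes to a
 * destination using only naturally aligned stores, writing each
 * destination byte exactly once.
 */
static void copy_to_device_mem(volatile void *dst, const void *src, size_t n)
{
	volatile uint8_t *d = dst;
	const uint8_t *s = src;

	/* Byte stores until the destination is 8-byte aligned. */
	while (n && ((uintptr_t)d & 7)) {
		*d++ = *s++;
		n--;
	}

	/* Aligned 64-bit stores to the destination. */
	while (n >= 8) {
		uint64_t v;

		/* The source is ordinary memory, so memcpy() into a local is fine. */
		memcpy(&v, s, sizeof(v));
		*(volatile uint64_t *)d = v;
		d += 8;
		s += 8;
		n -= 8;
	}

	/* Byte stores for the tail; nothing overlaps the stores above. */
	while (n--)
		*d++ = *s++;
}
```

The volatile qualifier keeps the compiler from merging or widening the
stores, at the cost of some throughput compared with memcpy(). Whether a
given mapping actually needs this treatment depends on its MMU
attributes, which is the open question above.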