Re: framebuffer corruption due to overlapping stp instructions on arm64
From: Mikulas Patocka
Date: Sat Aug 04 2018 - 09:30:45 EST
On Fri, 3 Aug 2018, Matt Sealey wrote:
> On 3 August 2018 at 13:25, Mikulas Patocka <mpatocka@xxxxxxxxxx> wrote:
> >
> >
> > On Fri, 3 Aug 2018, Ard Biesheuvel wrote:
> >
> >> Are we still talking about overlapping unaligned accesses here? Or do
> >> you see other failures as well?
> >
> > Yes - it is caused by overlapping unaligned accesses inside memcpy. When I
> > put "dmb sy" between the overlapping accesses in
> > glibc/sysdeps/aarch64/memcpy.S, this program doesn't detect any memory
> > corruption.
>
> It is a symptom of generating reorderable accesses inside memcpy. It's nothing
> to do with alignment, per se (see below). A dmb sy just hides the symptoms.
>
> What we're talking about here - yes, Ard, within certain amounts of
> reason - is that you cannot use PCI BAR memory as 'Normal' - certainly
> never cacheable memory, but Normal NC isn't good either.
So, are you going to map the PCI BAR as Device-nGnRE and then emulate all
the unaligned accesses in the trap handler?
Or are you going to give up on supporting PCIe graphics on ARM at all?
Video cards have had a linear framebuffer for 25 years. It was introduced as
a feature that simplified graphics programming a lot - programmers can use C
pointer arithmetic for drawing and they don't have to fiddle with
hardware registers. If you argue that graphics programmers can't use it
(after they have been using it for 25 years) - they will just ignore you
and ARM.
> Links is broken.
What else should it use? Are you going to introduce new functions
memcpy_to_framebuffer() and memset_framebuffer()?
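To make it concrete, this is roughly what such a helper would have to look
like. It is only a sketch: memcpy_to_framebuffer() is a hypothetical name, not
an existing kernel or libc function, and I am assuming that the destination
accepts byte stores and naturally aligned 64-bit stores. It never writes the
same byte twice, so the final contents do not depend on the order in which the
stores reach the device.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not an existing kernel or libc function).
 * Copies n bytes to dst using only non-overlapping stores.
 */
void memcpy_to_framebuffer(volatile void *dst, const void *src, size_t n)
{
	volatile unsigned char *d = dst;
	const unsigned char *s = src;

	/* Byte stores until the destination is 8-byte aligned. */
	while (n && ((uintptr_t)d & 7)) {
		*d++ = *s++;
		n--;
	}

	/* Aligned 64-bit stores; every byte is written exactly once. */
	while (n >= 8) {
		uint64_t v;

		memcpy(&v, s, sizeof(v));	/* the source may be unaligned */
		*(volatile uint64_t *)d = v;
		d += 8;
		s += 8;
		n -= 8;
	}

	/* Remaining tail bytes. */
	while (n) {
		*d++ = *s++;
		n--;
	}
}
```

Every program that draws through a pointer would have to call something like
this instead of plain memcpy(), and the same again for memset().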
> Even on Intel.
No, it's not. Intel will detect overlapping accesses.
You can write this - it is legal C code:
void g(void);

void overlapping(unsigned char *p)
{
	p[0] = p[1] = p[2] = p[3] = 1;
	g();
	p[3] = p[4] = p[5] = p[6] = 2;
}
and the compiler compiles it to this:
overlapping:
.LFB0:
	pushl	%ebx
	subl	$8, %esp
	movl	16(%esp), %ebx
	movl	$16843009, (%ebx)
	call	g
	movl	$33686018, 3(%ebx)
	addl	$8, %esp
	popl	%ebx
	ret
Now - if the CPU is incapable of detecting the hazard between the writes to
(%ebx) and 3(%ebx) and reorders these writes, it is just broken because it
violates the C standard.
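The result the program must see can be written down as a small check. This is
a sketch only: overlapping_words() and overlap_result_ok() are names I made up
for illustration, and on ordinary cacheable memory the check of course passes,
so it does not reproduce the framebuffer corruption. It just states which
contents are required after the two overlapping 32-bit stores.

```c
#include <stdint.h>
#include <string.h>

/*
 * Same store pattern as the compiled code above: a 32-bit store at
 * offset 0, then an overlapping 32-bit store at offset 3.
 * (Illustrative helper, not part of any existing code.)
 */
void overlapping_words(unsigned char *p)
{
	uint32_t a = 0x01010101u;	/* 16843009 */
	uint32_t b = 0x02020202u;	/* 33686018 */

	memcpy(p, &a, sizeof(a));	/* bytes 0..3 = 1 */
	memcpy(p + 3, &b, sizeof(b));	/* bytes 3..6 = 2, byte 3 overwritten */
}

/* Returns 1 if the first 7 bytes hold the result required by program order. */
int overlap_result_ok(const unsigned char *p)
{
	static const unsigned char expect[7] = { 1, 1, 1, 2, 2, 2, 2 };

	return memcmp(p, expect, sizeof(expect)) == 0;
}
```

If the second store were performed before the first one, byte 3 would end up
as 1 instead of 2 and overlap_result_ok() would return 0.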
If you argue that ARM is incapable of detecting this hazard and reorders
these two overlapping memory writes - it means that you can't use C
pointers to access videoram on ARM - which means that you can't have PCIe
graphics at all.
Mikulas