Re: framebuffer corruption due to overlapping stp instructions on arm64

From: Mikulas Patocka
Date: Mon Aug 06 2018 - 13:09:25 EST




On Mon, 6 Aug 2018, Ard Biesheuvel wrote:

> On 6 August 2018 at 14:42, Robin Murphy <robin.murphy@xxxxxxx> wrote:
> > On 06/08/18 11:25, Mikulas Patocka wrote:
> > [...]
> >>>
> >>> None of this explains why some transactions fail to make it across
> >>> entirely. The overlapping writes in question write the same data to
> >>> the memory locations that are covered by both, and so the ordering in
> >>> which the transactions are received should not affect the outcome.
> >>
> >>
> >> You're right that the corruption couldn't be explained just by reordering
> >> writes. My hypothesis is that the PCIe controller tries to disambiguate
> >> the overlapping writes, but the disambiguation logic was not tested and it
> >> is buggy. If there's a barrier between the overlapping writes, the PCIe
> >> controller won't see any overlapping writes, so it won't trigger the
> >> faulty disambiguation logic and it works.
> >>
> >> Could the ARM engineers look if there's some chicken bit in Cortex-A72
> >> that could insert barriers between non-cached writes automatically?
> >
> >
> > I don't think there is, and even if there was I imagine it would have a
> > pretty hideous effect on non-coherent DMA buffers and the various other
> > places in which we have Normal-NC mappings of actual system RAM.
> >
>
> Looking at the A72 manual, there is one chicken bit that looks like it
> may be related:
>
> CPUACTLR_EL1 bit #50:
>
> 0 Enables store streaming on NC/GRE memory type. This is the reset value.
> 1 Disables store streaming on NC/GRE memory type.
>
> so putting something like
>
> mrs x0, S3_1_C15_C2_0
> orr x0, x0, #(1 << 50)
> msr S3_1_C15_C2_0, x0
>
> in __cpu_setup() would be worth a try.

It won't boot.

But if I write back the same value that was read, it won't boot either.

I created a simple kernel module that reads this register: it has bit 32
set and all other bits clear. But when I write the same value back into it,
the core that does the write gets stuck in an infinite loop.
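
For reference, the values involved can be worked out in plain user-space C.
This is only a sketch of the bit arithmetic, assuming the bit 50 definition
quoted from the A72 manual above; the macro and helper names are made up for
illustration and nothing here touches the real register.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* CPUACTLR_EL1 bit 50: disables store streaming on NC/GRE memory
 * (per the A72 manual text quoted above). Name is hypothetical. */
#define CPUACTLR_DIS_NC_STREAMING	(UINT64_C(1) << 50)

/* Value observed by the test module: bit 32 set, all other bits clear. */
#define CPUACTLR_OBSERVED		(UINT64_C(1) << 32)

static_assert(CPUACTLR_DIS_NC_STREAMING == UINT64_C(0x0004000000000000),
	      "bit 50 mask");

/* Hypothetical helper: the value the suggested orr would produce. */
static uint64_t cpuactlr_disable_streaming(uint64_t val)
{
	return val | CPUACTLR_DIS_NC_STREAMING;
}

/* Hypothetical helper: check whether a read-back value has bit 50 set. */
static bool cpuactlr_streaming_disabled(uint64_t val)
{
	return (val & CPUACTLR_DIS_NC_STREAMING) != 0;
}
```

With the observed value 0x0000000100000000, the suggested sequence would
write 0x0004000100000000.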

So it seems that we are writing this register from the wrong place.

Mikulas