Re: Data corruption on i.MX6 IPU in arm_copy_from_user()

From: Krzysztof Hałasa
Date: Fri May 28 2021 - 06:03:52 EST


"Russell King (Oracle)" <linux@xxxxxxxxxxxxxxx> writes:

> In any case, looking at the architecture reference manual, LDM is
> permitted on device and strongly ordered mappings, and the memory
> subsystem is required to decompose it into a series of 32-bit accesses.
> So, it sounds to me like there could be a hardware bug in the buses/IPU
> causing this.

It seems so.

I modified the kernel IPU module a bit and initialized a bunch of IPU
registers to known values (1..0xD). Results obtained with different
instructions, reading from 1 to 13 IPU registers:

readl(13 consecutive registers): CSI = 1 2 3 4 5 6 7 8 9 A B C D
1 = register #0 and so on - readl() results are obviously correct.

LDM1: 1 (not corrupted)
LDM2: 1 3
LDM3: 1 3 4
LDM4: 2 3 4 4
LDM5: 1 3 4 5 6
LDM6: 1 3 4 5 6 7
LDM7: 1 3 4 5 6 7 8
LDM8: 2 3 4 5 6 7 8 8
LDM9: 1 3 4 5 6 7 8 9 A
LDM10: 1 3 4 5 6 7 8 9 A B
LDM11: 1 3 4 5 6 7 8 9 A B C
LDM12: 1 3 4 5 6 7 8 9 A B C D

The last one uses:
ldm r4, {r1, r2, r3, r4, r5, r6, r7, r8, r9, sl, fp, ip}.

I haven't tested more than 12 registers in one kernel LDMIA instruction.

The results don't depend on the address offset (adding 4, 8 or 12 to the
address doesn't change anything).

The arm_copy_from_user() is a specific case of the same corruption. It
uses a number of PLDs and 8-register LDMIAs (and then possibly LDRs
which don't fail). Each LDMIA ("LDM8") returns again:
LDM8: 2 3 4 5 6 7 8 8
(the same with subsequent LDMIAs: 10 11 12 13 14 15 16 16 and so on).

Summary: it appears that all LDMIA instructions loading 64 bits or more
(i.e., two or more registers) fail. The first or the second 32-bit access
is skipped (possibly somewhere between AXI and IPU). With 4- and
8-register LDMs the first (#0) value is skipped and the last value is
returned twice; otherwise it's the second (#1) value that is skipped.
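
The pattern in the table can be written down as a small C model. This is
only a sketch that reproduces the observed values for 1 to 12 registers;
the function name model_ldm() is made up for illustration and the code
says nothing about what the hardware actually does internally.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Descriptive model of the LDM results observed on the IPU.
 * regs[] holds the true register contents starting at the LDM base
 * address; it must have at least n + 1 entries, because in the
 * "value #1 skipped" case one register past the requested range
 * shows up in the result. Only n = 1..12 was actually tested.
 */
static void model_ldm(const uint32_t *regs, unsigned int n, uint32_t *out)
{
	unsigned int i;

	assert(n >= 1 && n <= 12);

	if (n == 1) {
		/* single 32-bit access: not corrupted */
		out[0] = regs[0];
	} else if (n == 4 || n == 8) {
		/* value #0 is skipped, the last value is returned twice */
		for (i = 0; i < n - 1; i++)
			out[i] = regs[i + 1];
		out[n - 1] = regs[n - 1];
	} else {
		/* value #1 is skipped, everything after it shifts by one */
		out[0] = regs[0];
		for (i = 1; i < n; i++)
			out[i] = regs[i + 1];
	}
}
```

With regs[] filled with 1, 2, 3, ... this gives "1 3" for n = 2 and
"2 3 4 4" for n = 4, matching the LDM2 and LDM4 rows above.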


Now the PLDs ring a bell:
"ERR003730 ARM: 743623—Bad interaction between a minimum of seven PLDs
and one Non-Cacheable LDM can lead to a deadlock". Looking at the
disassembly I can count 6 PLDs (the first two seem to be the same,
though I don't claim I understand this (source) .s code). Also, this
problem happens with the IPU and not with other devices, so I think it's
not related to this erratum after all.


size_t arm_copy_from_user(void *to, const void *from, size_t n)
... for n = 32 = 8 * 4 bytes:
2c: subs r2, r2, #4 ; r2 = 28
30: blt e4 ; not taken
34: ands ip, r0, #3 ; r0 = destination
38: pld [r1]
3c: bne 108 ; not taken
40: ands ip, r1, #3 ; r1 = address in IPU
44: bne 138 ; not taken
48: subs r2, r2, #28 ; r2 = 0
4c: push {r5, r6, r7, r8}
50: blt 88 ; not taken
54: pld [r1] ; duplicate PLD?
58: subs r2, r2, #0x60 ; r2 < 0
5c: pld [r1, #28]
60: blt 70 ; taken
64: pld [r1, #0x3c]
68: pld [r1, #0x5c]
6c: pld [r1, #0x7c]
70: ldm r1!, {r3, r4, r5, r6, r7, r8, ip, lr} ; <<<<< fails
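
As a cross-check of the PLD count, the r2 arithmetic in the listing can be
followed in C. This is a sketch that covers only the path shown above: it
assumes both pointers are 4-byte aligned and n >= 32, so that the branches
at 0x30, 0x3c, 0x44 and 0x50 fall through. Whatever runs after the LDM at
0x70 is not modelled. The function name is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Number of PLDs executed before the first LDM at 0x70, following the
 * listing. Valid only for 4-byte aligned pointers and n >= 32.
 */
static int plds_before_first_ldm(size_t n)
{
	long r2 = (long)n;
	int plds = 0;

	assert(n >= 32);	/* shorter copies take the branch at 0x50 */

	r2 -= 4;		/* 2c: subs r2, r2, #4 */
	plds++;			/* 38: pld [r1] */
	r2 -= 28;		/* 48: subs r2, r2, #28 */
	plds++;			/* 54: pld [r1] */
	r2 -= 0x60;		/* 58: subs r2, r2, #0x60 */
	plds++;			/* 5c: pld [r1, #28] */
	if (r2 >= 0)		/* 60: blt 70 falls through */
		plds += 3;	/* 64, 68, 6c */

	return plds;
}
```

For n = 32 the blt at 0x60 is taken and only three PLDs execute before the
failing LDM; all six are reached only when n >= 128. Either way this stays
below the seven PLDs named in the erratum.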

I also wonder if STMs may have similar problems - will check.
--
Krzysztof Hałasa

Sieć Badawcza Łukasiewicz
Przemysłowy Instytut Automatyki i Pomiarów PIAP
Al. Jerozolimskie 202, 02-486 Warszawa