Re: Regression: memory corruption on Atmel SAMA5D31

From: Tudor.Ambarus
Date: Mon Jun 27 2022 - 08:26:52 EST


On 6/21/22 13:46, Peter Rosin wrote:
>
> 2022-06-20 at 16:22, Tudor.Ambarus@xxxxxxxxxxxxx wrote:
>>
>>>
>>> git@xxxxxxxxxx:ambarus/linux-0day.git, branch dma-regression-hdmac-v5.18-rc7-4th-attempt
>>>
>>
>> Hi, Peter,
>>
>> I've just force-pushed to this branch. I had a typo somewhere, and with that fixed I could
>> no longer reproduce the bug. Tested for ~20 minutes. Would you please test the last 3 patches
>> and tell me if you can still reproduce the bug?
>
> Hi!
>
> I rebased your patches onto my current branch which is v5.18.2 plus a few unrelated
> changes (at least they are unrelated after removing the previous workaround to disable
> nand-dma entirely).
>
> The unrelated patches are two backports so that drivers recognize new compatibles [1][2],
> which should be completely harmless, plus a couple of proposed fixes from Codrin Ciubotariu
> that happen to fix EEPROM issues with the at91 I2C driver [3].
>
> On that kernel, I can still reproduce. It seems a bit harder to reproduce the problem now
> though. If the system is otherwise idle, the sha256sum test did not reproduce it in a run of
> 150+ attempts, but if I let the "real" application run while I do the test, I get a failure rate
> of about 10%, see below. The real application burns some CPU (but not all of it) and
> communicates with HW using I2C, native UARTs and two of the four USB-serial ports
> (FTDI, with the latency set to 1ms as mentioned earlier), so I guess there is more DMA
> pressure or something? There is a 100 Mbps network connection, but it was left "idle"
> during this test.
>

Thanks, Peter.
I got back to the office and I'm rechecking what could go wrong.

ta