Re: [Bug 219609] File corruptions on SSD in 1st M.2 socket of AsRock X600M-STX + Ryzen 8700G

From: Stefan
Date: Mon Feb 03 2025 - 13:48:57 EST


Hi,

I just got feedback from ASRock. They asked me to make a video of the
corruptions occurring on my remote (and headless) system.
Maybe I should make a video of printing out the logs that can be found on
the Linux and Debian bug trackers ...

Seems that ASRock is unwilling to solve the problem.

Regards Stefan


On 28.01.25 at 15:24, Stefan wrote:
Hi,

On 28.01.25 at 13:52, Dr. David Alan Gilbert wrote:
Is there any characterisation of the corrupted data; last time I
looked at the bz there wasn't.

Yes, there is. (And I already reported it at least on the Debian bug
tracker, see links in the initial message.)

f3 reports overwritten sectors, i.e. it looks like the pseudo-random
test pattern is written to the wrong position. These corruptions occur in
clusters whose size is an integer multiple of 2^17 bytes in most cases
(about 80%) and an integer multiple of 2^15 bytes in all cases.

The frequency of these corruptions is roughly 1 cluster per 50 GB written.
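
For anyone who wants to compare their own results against this
characterisation, here is a minimal Python sketch. It assumes you already
have a list of corrupted sector numbers (how you obtain it is left open)
and that a sector is 512 bytes. The names clusters() and classify() are
made up for illustration and are not part of f3.

```python
from itertools import groupby

SECTOR_SIZE = 512  # assumption: sectors are 512 bytes


def clusters(bad_sectors):
    """Group sector numbers into runs of consecutive sectors.

    Returns a list of (first_sector, size_in_bytes) tuples.
    """
    ordered = sorted(set(bad_sectors))
    runs = []
    # Within a run of consecutive sectors, value minus index is constant.
    for _, grp in groupby(enumerate(ordered), key=lambda p: p[1] - p[0]):
        members = [sector for _, sector in grp]
        runs.append((members[0], len(members) * SECTOR_SIZE))
    return runs


def classify(size_bytes):
    """Report the largest of the two block sizes that divides the cluster size."""
    if size_bytes % 2**17 == 0:
        return "multiple of 2^17"
    if size_bytes % 2**15 == 0:
        return "multiple of 2^15"
    return "other"


if __name__ == "__main__":
    # Hypothetical input: one 128 KiB run, one 32 KiB run, one stray sector.
    bad = list(range(1000, 1256)) + list(range(5000, 5064)) + [9000]
    for first, size in clusters(bad):
        print(first, size, classify(size))
```

If the pattern described above holds, no cluster should end up in the
"other" class, and roughly 80% should be multiples of 2^17.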

Can others confirm this or do they observe a different characteristic?

Regards Stefan


I mean, is it reliably any of:
    a) What's the size of the corruption?
           block, cache line, word, bit???
    b) Position?
           e.g. last word in a block or something?
    c) Data?
           pile of zeros/ff's, junk, etc.?

    d) Is it a missed write, old data, or partially written block?

Dave

Phew.  I'm kinda lost on what we could do about this on the Linux
side.

Because it also depends on the CPU series, a firmware or hardware issue
seems to be more likely than a Linux bug.

ATM ASRock is still trying to reproduce the issue. (I'm in contact with
them too. But they have Chinese New Year holidays in Taiwan this week.)

If they can't reproduce it, they have to provide an explanation why the
issues are seen by so many users.

Regards Stefan