Hi,
On 28.01.25 at 13:52, Dr. David Alan Gilbert wrote:
Is there any characterisation of the corrupted data; last time I
looked at the bz there wasn't.
Yes, there is. (And I already reported it at least on the Debian bug
tracker, see links in the initial message.)
f3 reports overwritten sectors, i.e. it looks like the pseudo-random
test pattern is written to the wrong position. These corruptions occur
in clusters whose size is an integer multiple of 2^15 bytes (32 KiB) in
all cases and of 2^17 bytes (128 KiB) in most cases (about 80%).
The frequency of these corruptions is roughly 1 cluster per 50 GB written.
Can others confirm this or do they observe a different characteristic?
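For anyone who wants to compare their own results against this pattern,
here is a minimal sketch of how such a check could look. It assumes you
already have a list of corrupted sector numbers from some other source
(it does not parse f3 output), and that sectors are 512 bytes. The
function names group_clusters and classify are made up for this
illustration.

```python
SECTOR_SIZE = 512  # bytes; assumption, adjust to the sector size of your test tool


def group_clusters(bad_sectors):
    """Group sector numbers into runs of consecutive sectors.

    Returns a list of (start_sector, length_in_sectors) tuples.
    """
    clusters = []
    for sector in sorted(set(bad_sectors)):
        if clusters and sector == clusters[-1][0] + clusters[-1][1]:
            clusters[-1][1] += 1  # extends the current run
        else:
            clusters.append([sector, 1])  # starts a new run
    return [(start, length) for start, length in clusters]


def classify(clusters, sector_size=SECTOR_SIZE):
    """Count clusters whose byte size is a multiple of 2^15 and of 2^17."""
    sizes = [length * sector_size for _, length in clusters]
    return {
        "clusters": len(sizes),
        "multiple_of_2^15": sum(1 for size in sizes if size % 2**15 == 0),
        "multiple_of_2^17": sum(1 for size in sizes if size % 2**17 == 0),
    }


if __name__ == "__main__":
    # Made-up example data: one 32 KiB run and one 128 KiB run.
    bad = list(range(0, 64)) + list(range(1000, 1256))
    print(classify(group_clusters(bad)))
```

If your numbers differ clearly from the 2^15 / 2^17 pattern, that would
be worth reporting as a different characteristic.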
Regards Stefan
I mean, is it reliably any of:
a) What's the size of the corruption?
block, cache line, word, bit???
b) Position?
e.g. last word in a block or something?
c) Data?
pile of zeros/ff's, junk, etc.?
d) Is it a missed write, old data, or partially written block?
Dave
Phew. I'm kinda lost on what we could do about this on the Linux
side.
Because it also depends on the CPU series, a firmware or hardware issue
seems to be more likely than a Linux bug.
ATM ASRock is still trying to reproduce the issue. (I'm in contact with
them, too. But they have Chinese New Year holidays in Taiwan this week.)
If they can't reproduce it, they will have to explain why the issue is
seen by so many users.
Regards Stefan