Re: [Regression] File corruptions on SSD in 1st M.2 socket of AsRock X600M-STX
From: Thorsten Leemhuis
Date: Thu Jan 09 2025 - 03:52:51 EST
[CCing the people from
https://bugzilla.kernel.org/show_bug.cgi?id=219609, as they permitted that.
Stefan, Bruno, reminder: some developers might not follow the ticket or
might be unwilling to go to a web-based bug tracker; so any answers to questions
that are raised here via email might not be seen if you only provide
them in the bug tracker; yes, that sucks, but that's how it is for now;
hopefully things on that front will improve soon.]
On 09.01.25 09:28, Christoph Hellwig wrote:
> On Wed, Jan 08, 2025 at 08:07:28AM -0700, Keith Busch wrote:
>> It should always be okay to do smaller transfers as long as everything
>> stays aligned to the logical block size. I'm guessing the dma opt change
>> has exposed some other flaw in the nvme controller. For example, two
>> consecutive smaller writes are hitting some controller side caching bug
>> that a single larger transfer would have handled correctly. The host
>> could have sent such a sequence even without the patch reverted, but
>> happens to not be doing that in this particular test.
>
> Yes. This somehow reminds me of the bug with an Intel SSD that got
> really upset with quickly following writes to different LBAs inside the
> same indirection unit. But as the new smaller size is nicely aligned
> that seems unlikely. Maybe the higher number of commands simply overloads
> the buggy firmware?
Thx for the assessment. FWIW, I bought such a machine myself recently
and it's still in a state where I could abandon the install. I haven't
checked yet if mine is affected, too.
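
To illustrate the alignment point Keith raised above, here is a minimal
sketch (not kernel code) of splitting one large transfer into smaller
commands that all stay aligned to the logical block size. The function
name split_transfer and the sizes in the usage note are made up for
illustration; they are not taken from the bug reports.

```python
def split_transfer(start, length, max_bytes, block_size):
    """Split the byte range [start, start + length) into chunks.

    Hypothetical helper for illustration only. Each chunk is at most
    max_bytes long and begins and ends on a block_size boundary.
    Returns a list of (offset, size) tuples.
    """
    if block_size <= 0 or max_bytes < block_size:
        raise ValueError("max_bytes must hold at least one logical block")
    if start % block_size or length % block_size:
        raise ValueError("transfer is not aligned to the logical block size")

    # Round the per-command limit down to a whole number of blocks so
    # that every chunk boundary lands on a block boundary.
    chunk = (max_bytes // block_size) * block_size

    chunks = []
    offset = start
    end = start + length
    while offset < end:
        size = min(chunk, end - offset)
        chunks.append((offset, size))
        offset += size
    return chunks
```

With an assumed 512-byte logical block size and an assumed 128 KiB
per-command limit, split_transfer(0, 1024 * 1024, 128 * 1024, 512)
yields eight back-to-back chunks of 131072 bytes each, all of them
block-aligned. From the host's point of view that is as valid as a
single 1 MiB command, which is why corruption in that case would point
at the controller.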
> Of course the real question is why we're even seeing the limitation.
> The value suggests it's the swiotlb one. Does the system use AMD SEV
> (memory encryption)?
In case it is helpful to anyone: there are some logs buried deep in
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1076372 and I'm attaching
one of the kernel logs I found there (there were multiple ones; hope I
picked an appropriate one) for easier access.
Ciao, Thorsten

Attachment:
kern.log-6.11.5
Description: Unix manual page