Re: Race to power off harming SATA SSDs

From: David Woodhouse
Date: Mon May 08 2017 - 03:22:22 EST


On Sun, 2017-05-07 at 22:40 +0200, Pavel Machek wrote:
> > > NOTE: unclean SSD power-offs are dangerous and may brick the device in
> > > the worst case, or otherwise harm it (reduce longevity, damage flash
> > > blocks).  It is also not impossible to get data corruption.
>
> > I get that the incrementing counters might not be pretty but I'm a bit
> > skeptical about this being an actual issue.  Because if that were
> > true, the device would be bricking itself from any sort of power
> > losses be that an actual power loss, battery rundown or hard power off
> > after crash.
>
> And that's exactly what users see. If you do enough power fails on a
> SSD, you usually brick it; some die sooner than others. There were some
> test results published, some are here
> http://lkcl.net/reports/ssd_analysis.html, I believe I've seen some
> others too.
>
> It is very hard for a NAND to work reliably in the face of power
> failures. In fact, not even Linux MTD + UBIFS works well in that
> regard. See
> http://www.linux-mtd.infradead.org/faq/ubi.html. (Unfortunately, it's
> down now?!). If we can't get it right, do you believe SSD manufacturers
> do?
>
> [Issue is, if you powerdown during erase, you get "weakly erased"
> page, which will contain expected 0xff's, but you'll get bitflips
> there quickly. Similar issue exists for writes. It is solvable in
> software, just hard and slow... and we don't do it.]

It's not that hard. We certainly do it in JFFS2. I was fairly sure that
it was also part of the design considerations for UBI - it really ought
to be right there too. I'm less sure about UBIFS but I would have
expected it to be OK.
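
A minimal sketch of the general idea, assuming the scheme is "write a
marker only after an erase has completed, and re-erase any block that
reads as erased but carries no marker at mount time". This is a toy
Python model, not JFFS2 code: Block, safe_erase, mount_scan and the
MARKER bytes are all hypothetical names made up for illustration.

```python
ERASED = 0xFF
MARKER = b"CLEAN"  # hypothetical marker bytes, not a real on-flash format


class Block:
    """Toy model of one erase block."""

    def __init__(self, size=64):
        self.data = bytearray([0x00] * size)  # starts out holding stale data
        self.weak = False  # True if the last erase was cut short

    def erase(self, power_fails=False):
        # An interrupted erase still reads back as all 0xff,
        # but the cells are marginal and will flip bits later.
        self.data = bytearray([ERASED] * len(self.data))
        self.weak = power_fails

    def write_marker(self):
        self.data[:len(MARKER)] = MARKER

    def has_marker(self):
        return bytes(self.data[:len(MARKER)]) == MARKER

    def looks_erased(self):
        return all(b == ERASED for b in self.data)


def safe_erase(block, power_fails=False):
    """Erase, then record completion by writing the marker."""
    block.erase(power_fails)
    if power_fails:
        return  # power was lost before the marker could be written
    block.write_marker()


def mount_scan(blocks):
    """Re-erase blocks that read as erased but have no marker.

    Returns the indices of the blocks that were re-erased.
    """
    redone = []
    for i, blk in enumerate(blocks):
        # All 0xff means no marker was written, so the erase
        # cannot be trusted to have completed.
        if blk.looks_erased():
            safe_erase(blk)
            redone.append(i)
    return redone
```

The point of the marker is that "reads as 0xff" alone cannot distinguish
a complete erase from an interrupted one, so the scan trusts only blocks
whose erase was explicitly recorded as finished.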

SSDs however are often crap; power fail those at your peril. And of
course there's nothing you can do when they do fail, whereas we accept
patches for things which are implemented in Linux.
