Re: POSIX violation by writeback error

From: Theodore Y. Ts'o
Date: Thu Sep 27 2018 - 10:27:54 EST


On Thu, Sep 27, 2018 at 08:43:10AM -0400, Jeff Layton wrote:
>
> Basically, the problem (as I see it) is that we can end up evicting
> uncleanable data from the cache before you have a chance to call fsync,
> and that means that the results of a read after a write are not
> completely reliable.

Part of the problem is that people don't agree on what the problem is. :-)

The original posting was from someone who claimed it was a "POSIX
violation" if a subsequent read returns *successfully*, but then the
writeback fails.
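
For concreteness, here is a rough sketch of the sequence being argued
about: a write, a read-back that will normally be served from the page
cache, and only then an fsync() whose return value is the first place a
writeback error is reported to the caller. The function name
write_read_fsync() and its return codes are made up for illustration;
only the system calls themselves are real.

```c
#define _XOPEN_SOURCE 700
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from any existing code base).
 * Returns  0 if write, read-back, and fsync() all succeed,
 *         -1 if open/write/read-back fails,
 *         -2 if the read-back succeeded but fsync() then reported an error.
 */
int write_read_fsync(const char *path)
{
    static const char msg[] = "hello";
    char buf[sizeof(msg)];

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    /* The data lands in the page cache; nothing is guaranteed durable yet. */
    if (write(fd, msg, sizeof(msg)) != (ssize_t)sizeof(msg)) {
        close(fd);
        return -1;
    }

    /* Read it back. This normally succeeds from the cache, whether or
     * not the data will ever reach the storage device. */
    if (pread(fd, buf, sizeof(buf), 0) != (ssize_t)sizeof(buf) ||
        memcmp(buf, msg, sizeof(msg)) != 0) {
        close(fd);
        return -1;
    }

    /* Only here does the application learn whether writeback failed. */
    if (fsync(fd) != 0) {
        perror("fsync");
        close(fd);
        return -2;
    }

    close(fd);
    return 0;
}
```

A return of -2 is the case under dispute: the read returned the new
data, and the failure only became visible afterwards.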

Other people are worried about the problem you describe; yet others are
worried about the system wedging and OOM-killing itself, etc.

The problem is that in the face of I/O errors, it's impossible to keep
everyone happy. (You could make the local storage device completely
reliable, with a multi-million dollar storage array with remote
replication, but then the CFO won't be happy; and other people were
talking about making things work with cheap USB thumb drives and
laptops. This is the very definition of an over-constrained problem.)

- Ted