Re: 2.6.35 Regression: Ages spent discarding blocks that weren't used!

From: Christoph Hellwig
Date: Sat Aug 14 2010 - 07:44:08 EST


On Fri, Aug 13, 2010 at 11:15:38AM -0700, Hugh Dickins wrote:
> However, I am still not quite sure that we can already make that
> change for 2.6.35 (-stable). Can you reassure me on the question I
> raise above: if we issue a discard to a device with cache, wait for
> "completion", then issue a write into the area spanned by that
> discard, can we be certain that the write to backing store will not be
> reordered before the discard of backing store (unless the device is
> just broken)? Without a REQ_HARDBARRIER in the 2.6.35 scheme? It
> seems a very reasonable assumption to me, but I'm learning not to
> depend upon reasonable assumptions here. (By the way, it doesn't
> matter at all whether writes not spanned by the discard pass it or
> not.)

The SCSI specs (SPC and SBC) do not make the cache part of the protocol,
except for the commands that commit it to non-volatile storage, so
even when a device reorders writes to the backing store it must still
not reorder them vs notified completion. That's nothing specific to
discard, e.g. when a write was notified as complete a new read must
come from the cache even if it hasn't been committed to the backing
device. Now I can't guarantee that all cheap SSD firmware
implementations get this right for TRIM, but if one is really
that buggy we need to blacklist it.
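
To make the ordering argument concrete, here is a toy model of a device
with a volatile write cache. It is only a sketch under stated
assumptions, not how any real firmware or the kernel block layer works:
the CachedDevice class, its method names and the DISCARDED sentinel are
all hypothetical. The cache holds the newest notified state per block,
so the device is free to update the backing store in any order across
blocks, yet a write issued after a completed discard of the same block
can never end up "before" that discard.

```python
import random

DISCARDED = object()  # hypothetical sentinel: a discard is pending for this block


class CachedDevice:
    """Toy block device with a volatile cache (illustrative assumption only)."""

    def __init__(self):
        self.backing = {}  # block number -> data on non-volatile media
        self.cache = {}    # block number -> newest state whose completion was notified

    def write(self, block, data):
        # Returning from this method models "completion notified".
        self.cache[block] = data

    def discard(self, block):
        # A discard replaces whatever state was cached for the block.
        self.cache[block] = DISCARDED

    def read(self, block):
        # A read must observe the newest completed command, cached or not.
        if block in self.cache:
            state = self.cache[block]
            return None if state is DISCARDED else state
        return self.backing.get(block)

    def flush(self):
        # Commit the cache in arbitrary order across blocks. Per block only
        # the newest state exists, so same-block ordering cannot be violated.
        blocks = list(self.cache)
        random.shuffle(blocks)
        for block in blocks:
            state = self.cache.pop(block)
            if state is DISCARDED:
                self.backing.pop(block, None)
            else:
                self.backing[block] = state


dev = CachedDevice()
dev.write(7, "old")
dev.flush()             # "old" is now on the backing store
dev.discard(7)          # completion notified
dev.write(7, "new")     # issued only after the discard completed
assert dev.read(7) == "new"
dev.flush()
assert dev.backing[7] == "new"
```

A device that instead queued the discard and the write separately and
applied them to the backing store in the opposite order would be the
kind of broken firmware that needs blacklisting.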