Re: [PATCH 7/13]: PCI Err: Symbios SCSI driver recovery

From: Grant Grundler
Date: Sat Jul 02 2005 - 03:18:25 EST


On Wed, Jun 29, 2005 at 11:34:08AM -0500, Linas Vepstas wrote:
...
> requests get replayed, in a fashion similar to what would be needed
> after a host reset. In particular, there shouldn't be any (permanent)
> file system corruption because any inconsistent state on the disk
> would get over-written when the queued requests get re-issued.

FS's that require some ordering (journal) should be handling
this sort of stuff already. I have the same expectations as
Linas does WRT design. FS's that don't will have the same sort
of problems they would have if the OS had crashed.

> FWIW, yes, I have heard of devices that "cheat", and report back that a
> transaction is complete, even though it is still pending in firmware
> somewhere, either on the host or the disk. Those devices get screwed.

See "Write Cache Enabled" (aka WCE, or in HP-UX speak "Immediate Reporting").
WCE must be disabled if data corruption cannot be tolerated.
"Desktop" (i.e. unix workstation) systems typically have WCE enabled
so they look good on (stupid) performance benchmarks.
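
For anyone who wants to check a drive: WCE is bit 2 of byte 2 in the SCSI
Caching mode page (page code 0x08). Below is a minimal sketch of decoding
it, assuming you already have the raw page bytes from MODE SENSE with the
mode parameter header and block descriptors stripped off. The function and
constant names are made up for illustration and how you fetch the page is
left out.

```python
CACHING_PAGE_CODE = 0x08   # SCSI Caching mode page
WCE_MASK = 0x04            # byte 2, bit 2: Write Cache Enable


def wce_enabled(page: bytes) -> bool:
    """Return True if the WCE bit is set in a raw Caching mode page.

    Assumption: `page` starts at the page itself (byte 0 = page code),
    i.e. the MODE SENSE header and block descriptors are already removed.
    """
    if len(page) < 3:
        raise ValueError("mode page too short to contain the WCE bit")
    # Bits 5:0 of byte 0 hold the page code; bit 7 is PS, bit 6 is SPF.
    if page[0] & 0x3F != CACHING_PAGE_CODE:
        raise ValueError("not a Caching mode page")
    return bool(page[2] & WCE_MASK)


if __name__ == "__main__":
    # Hypothetical sample page: code 0x08, length 0x12, WCE set.
    sample = bytes([0x08, 0x12, 0x04]) + bytes(17)
    print("WCE enabled" if wce_enabled(sample) else "WCE disabled")
```

If this reports WCE enabled on a box where corruption matters, the drive
is acknowledging writes that may still be sitting in volatile cache.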

The only devices that lie about WCE have battery-backed RAM buffers
(e.g. SCSI RAID *devices* - multi-LUN, dual-controller beasts).

> No doubt, this will happen to some giant banking customer,

It won't happen because of WCE. None of the major HW vendors will
sell or support HW with WCE enabled. Exactly for the reasons you
point out.

grant