Re: memory & filesystem corruption under heavy load?

Gerard Roudier (groudier@iplus.fr)
Sun, 7 Apr 1996 01:10:08 +0000 (GMT)


Dear Rob,

On Sat, 6 Apr 1996, Robert L Krawitz wrote:

> I've had very poor luck with the BSD driver (through 1.7) -- lots of
> hangs and crashes and such (I don't remember offhand if I've had
> corruption -- usually my system doesn't stay up long enough to make it
> interesting). As a result, I'm very reluctant to experiment with this
> driver. Apparently a lot of people are finding it works for them, and
> they like it much better than Drew's.

> I also didn't find any performance gain at all from the 1.7 driver,
> and slightly more expensive in CPU time.

I am the first user of ncrBsd2Linux and I don't have any problems with it.
I use it on 2 machines: my personal computer at home and my workstation
at work. These machines run linux-1.2.13 (more stable than 1.3.8X).
I run recent versions of Linux on my personal machine only for testing.

The only features that can improve performance over Drew's driver are:
- Tagged Command Queuing
- Wide SCSI

It is not possible to see the advantage of Tagged Queueing with benchmarks
like Bonnie, Iozone, Test suite and Hdparm.
I have run some simulations of multi-threaded hard disk load,
and I have often observed a gain of less than 5%.
However, I have measured a gain of 25% in elapsed time while copying lots of
files between 2 partitions of the same HD (DEC, 1 gigabyte).
It seems that the HD firmware is very important for Tagged Queueing.
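
The simulations mentioned above are not described in this message. As a
rough illustration only, a multi-threaded load can be sketched as several
workers issuing random block reads at the same time, so that the drive has
more than one request outstanding. Everything here is an assumption made
for the sketch (the function name random_read_load, the 4 KB block size,
the thread count); it is not the test that produced the figures above.

```python
import os
import random
import threading
import time


def random_read_load(path, threads=4, reads_per_thread=200,
                     block_size=4096, seed=0):
    """Run concurrent random block reads; return (total_reads, seconds)."""
    size = os.path.getsize(path)
    blocks = max(1, size // block_size)
    counts = [0] * threads  # one slot per worker, so no lock is needed

    def worker(index):
        rng = random.Random(seed + index)  # reproducible per-thread offsets
        # Each worker opens its own descriptor so seeks do not interfere.
        with open(path, "rb", buffering=0) as f:
            for _ in range(reads_per_thread):
                f.seek(rng.randrange(blocks) * block_size)
                f.read(block_size)
                counts[index] += 1

    start = time.perf_counter()
    pool = [threading.Thread(target=worker, args=(i,)) for i in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return sum(counts), time.perf_counter() - start
```

To say anything about the disk itself, the file would have to be much
larger than RAM (or the reads made against a raw device); otherwise the
reads are served from the buffer cache and the timing is meaningless.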

I do not have any Wide controller or Wide peripheral devices.
However, I think the same remark as above applies to this feature.

The most probable software problems that you can have with the SCSI subsystem
are TIMEOUTS.
The current values defined in the middle and just-above-middle SCSI
drivers are, in my opinion, too short.

For example:
- SD timeout is 7 seconds.
- A HD can sometimes spend more than 2 seconds on head recalibration.
- Tagged Queueing devices may reorder requests.
- You may have more than 4 drives on a controller (max 15).
- You may use other fast devices that load CPU, memory, bus, ...
- You may have lots of interrupts.
- Some bad SCSI devices may not disconnect from the SCSI bus fairly.
- Etc...

Now, do you agree that 15 seconds seems to be a more reasonable value
for the SD timeout?
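
The arithmetic behind this question can be sketched as follows. Only the
7 second timeout, the 15 second proposal and the 2 second recalibration
come from the list above; the queue depth of 4 and the 1.5 seconds per
command are invented for illustration, and the function names are
hypothetical.

```python
def worst_case_wait(queued_commands, service_seconds,
                    recalibration_seconds=2.0):
    """Upper bound on the wait for a command that the drive services last.

    With Tagged Queueing the drive may reorder requests, so one command
    can end up behind every other queued command, plus one recalibration.
    """
    return queued_commands * service_seconds + recalibration_seconds


def exceeds_timeout(wait_seconds, timeout_seconds):
    """True if the command would be declared timed out."""
    return wait_seconds > timeout_seconds


wait = worst_case_wait(queued_commands=4, service_seconds=1.5)
print(wait)                        # 8.0
print(exceeds_timeout(wait, 7))    # True
print(exceeds_timeout(wait, 15))   # False
```

Under these assumed numbers a healthy but busy drive already exceeds a
7 second timeout, while 15 seconds still leaves some margin.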

To set ncrBsd2Linux to the same configuration as Drew's driver, you must
disable the Disconnection Privilege. Unfortunately, this option is not easy to
set: you have to patch the driver code manually.
Wide, Sync and CmdQueue capabilities can be disabled by editing ncr53c8xx.h
and changing some values in the table of device capabilities.

In my opinion, it is quite bad to use a SCSI bus without disconnections.

Since release 1.8, ncrBsd2Linux has an install script.
At the moment, this script does not ask any questions about the configuration
of the driver and devices. Perhaps I shall add this to the script in the
next release.

Am I right to guess that you use Drew's driver?
Do you use it with ALWAYS_DISCONNECT, ALWAYS_SYNCHRONOUS, and some other
10 MB/sec configuration options?

In ncrBsd2Linux these features are enabled by default (as is Wide).
I cannot imagine that people buy high-performance devices and
intend to use them at low performance.

Best regards,
Gerard.