fd0 causes Kernel PANIC (an unresolved long tale of woe)

Dan St.Andre' (grillon@m3.interserv.com)
Sat, 12 Apr 1997 10:47:02 -0500


Folks,
I'm a newbie but here is what's happening to me. **I** think it is a
hardware problem, but "... we don't have this problem on DOS ..." is causing
Linux to get a bad name.

QUESTION:
Are there any known problems with (a) kernel modules, (b) diskette drivers,
or (c) file system modules? Are they fixed in any release or RPM?

QUESTION:
Are you aware of any diskette or diskette controller hardware
configurations (or other hardware, for that matter) that cause or
contribute to these problems?

QUESTION:
How might I instrument this so that (1) **I** can get a handle on what
is really happening -- hardware vs. software, and (2) I can collect data for
others to use to resolve any real software issues?

Context:
I'm still using the RH 3.0.3 distribution. I cannot do RH4x yet because I
do not have time to sort out all the PAM, CHAP, PAP whatever that I see on
the lists.

The server is a commercial Pentium tower built by Intel.
The clients are PCs with an industrial form factor called PC/104. They are
really PCs in every way except mechanically.

The Situation:
intermittent KERNEL PANIC as a result of troubles with the diskette device
[major number 002].
1) our server never has a hiccup
2) our clients fail routinely
3) Yesterday, Friday 11 April 1997, we could hardly do anything at all.

What we Do:
We are using diskettes for sneaker net by typing
mount /a; cp $FILES /a; umount /a
or
mount /msdos; cp $FILES /msdos; umount /msdos
[Of course the copy might go the other way.]
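
A minimal sketch of how the sequence above could be wrapped so that the exit
status of each step is recorded. This is not something we run today: the
function names sneaker_copy and log_step and the SNEAKER_LOG path are
hypothetical, chosen only for illustration.

```shell
#!/bin/sh
# Hypothetical wrapper around the mount / cp / umount sequence.
# SNEAKER_LOG is an assumed log location, not part of our current setup.
SNEAKER_LOG="${SNEAKER_LOG:-/tmp/sneaker.log}"

# log_step LABEL STATUS: append one timestamped line to the log.
log_step() {
    echo "$(date '+%Y-%m-%d %H:%M:%S') $1 exit=$2" >> "$SNEAKER_LOG"
}

# sneaker_copy MOUNTPOINT FILE...
# Mounts the diskette, copies the files, and tries to unmount whenever
# the mount succeeded. Returns the status of the first failing step, or 0.
sneaker_copy() {
    mp="$1"; shift

    mount "$mp"; rc=$?
    log_step "mount $mp" "$rc"
    [ "$rc" -eq 0 ] || return "$rc"

    cp "$@" "$mp"; rc=$?
    log_step "cp to $mp" "$rc"

    umount "$mp"; urc=$?
    log_step "umount $mp" "$urc"

    [ "$rc" -ne 0 ] && return "$rc"
    return "$urc"
}
```

It would be called as "sneaker_copy /a $FILES" or "sneaker_copy /msdos $FILES".
It cannot log anything if the mount never returns, but it would at least show
which step fails when one does.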

The mount points are in /etc/fstab as follows:

/dev/fd0 /a ext2 user,noauto 0 0
/dev/fd0 /msdos msdos user,noauto 0 0

What we Tried:
1) we modified /etc/rc.d/init.d/syslog to make klogd more verbose
daemon klogd -c 7
2) we modified /etc/fstab to be more defensive
/dev/fd0 /aC ext2 user,noauto,errors=continue,check=strict 0 0
/dev/fd0 /msdosC msdos user,noauto,errors=continue,check=strict 0 0

Analysis:
1) diskettes written on the server with /aC mount point could not be read
on the clients without a flood of I/O complaints
2) client mount point does not seem to matter
3) things work fine for a while
4) once trouble starts, we must cycle system power
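
To narrow down whether a diskette written on one machine really reads back
intact on another, a byte-for-byte comparison could be run after re-mounting
the diskette. The following is only a sketch; verify_files is a hypothetical
helper name, and it assumes the original files are still available on the
machine doing the check.

```shell
#!/bin/sh
# verify_files DIR FILE...
# Compares each FILE with the copy of the same name under DIR using cmp.
# Prints one line per file; returns 0 only if every file matches.
# Unmount and re-mount the diskette first so the read comes from the
# media rather than from the buffer cache.
verify_files() {
    dir="$1"; shift
    bad=0
    for f in "$@"; do
        if cmp -s "$f" "$dir/$(basename "$f")"; then
            echo "OK   $f"
        else
            echo "FAIL $f"
            bad=$((bad + 1))
        fi
    done
    [ "$bad" -eq 0 ]
}
```

For example: "mount /a; verify_files /a $FILES; umount /a".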

During mount:
floppy0: probe failed...
...
floppy0: probe failed...
... auto retry
floppy0: probe failed...
...
floppy0: probe failed...
... success

During mount:
... requesting process appears to stall waiting for mount command completion
... diskette light is on solid
... requesting process is in D-wait
manually eject the diskette
... sometimes mount completes reporting "wrong fs-type..." error
... sometimes we get Kernel PANIC

During copy:
I/O error...
... sometimes the copy retries and succeeds
... sometimes the copy retries and fails
... sometimes we get a Kernel PANIC
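
As one possible way to turn the messages above into data others could use, the
kernel log could be tallied per session. This is a sketch only:
tally_floppy_msgs is a hypothetical name, and the two patterns are taken from
the abbreviated excerpts above, so the exact wording of the real messages may
differ.

```shell
#!/bin/sh
# tally_floppy_msgs: reads kernel log text on stdin and prints how many
# lines mention a failed floppy probe and how many mention an I/O error.
# The patterns are assumptions based on the excerpts quoted in this post.
tally_floppy_msgs() {
    awk '
        /floppy0: probe failed/ { probe++ }
        /I\/O error/            { io++ }
        END { printf "probe_failed=%d io_error=%d\n", probe, io }
    '
}
```

It could be fed from "dmesg | tally_floppy_msgs", or from whatever file syslogd
writes kernel messages to on the system in question.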
============================================================================
"In a dragon fight, often times, the bleachers get scorched."
... let The GRILLON Group help you slay your dragons. Call Today.
============================================================================
Daniel M. St.Andre'                  The GRILLON Group
voice: 512.331.8271                  Information Management Consultants
fax: 512.331.8915                    10511 Weller Drive
ofc email: grillon@interserv.com     Austin, TX 78750 USA
============================================================================