Kernel newbie finds "infelicity", if not bug, in IDE[-PROBE].C ?

BROWN Nick (Nick.BROWN@coe.fr)
Tue, 20 Apr 1999 14:33:25 +0200


[If this is not the place to post this, please point me in the right
direction. I checked the list of mailing lists at Kernel Traffic and didn't
find anything more specific.]

I sent this on Saturday to Mark Lord and Gadi Oxman, who were listed as the
maintainers of IDE.C, but my mail was so vague that I hope they have just
trashed it. This time I think I'm most of the way down the tunnel, although
since it's the first time I've looked at the kernel in the seven weeks since
I installed Linux, please be gentle with me.

This problem concerns IDE.C in 2.0.36 and (98% certain; I just have to run
the 2.2 code tonight, but the source looks identical) IDE-PROBE.C in 2.2.1,
but if it does turn out to be a bug, the fix might lie in setup.S.

The problem appeared last week when I upgraded my home PC's motherboard's
BIOS. (QDI P5I430TX Titanium I, if anyone's remotely interested). My
system has:
- a 3098 MB Western Digital IDE disk
- a Symbios C810 (chip = NCR 53C8xx) SCSI controller
- a variety of SCSI disks depending on phase of the moon
The BIOS on this mobo is explicitly sympathetic to the NCR SCSI chip and can
use its disks as boot devices, for example.

After the BIOS upgrade, LILO wouldn't work, and neither would [c]fdisk.
They all complained (approx, I'm at work, so busking) that the BIOS geometry
didn't match the on-disk geometry.

I rebooted (this is all on 2.0.36, but bear with me, the problem still seems
to be there in 2.2.1 at least) and paid close attention to the messages.
When the IDE driver started, it said
hda: WDC AC23200L, 3098MB w/256kB Cache, CHS=1023/130/63, UDMA
Previously, this has been
hda: WDC AC23200L, 3098MB w/256kB Cache, CHS=787/128/63, UDMA

I downgraded to the old BIOS version - no problem. Upgraded - problem came
back. Made a note to call the motherboard manufacturer. Then I did some
arithmetic and found that 1023*130*63 matched my first SCSI drive (4091 MB).
Swapped the SCSI drives around, so the 2047 MB drive was first, and got
hda: WDC AC23200L, 3098MB w/256kB Cache, CHS=1023/65/63, UDMA

Hmmm. IDE provides the name, MB and cache size, SCSI provides the CHS.
Weird.

I got out the sources and inserted some debug statements in IDE.C. Found
that the problem was "occurring" in the routine probe_cmos_for_drives(), in
that it's there that the values, loaded from the BIOS fixed disk parameter
table at boot time in setup.S, were being placed into IDE's drive data
structures. Booted DOS and looked with DEBUG. Sure enough, the INT 41h
pseudo-vector in the new BIOS version, points to a table containing the SCSI
drive's geometry; in the old version, it's the IDE drive.

Now, I'd be happy to dismiss this as a BIOS bug and downgrade, or boot with
hda=c,h,s - but a couple of things worry me:
- There is no [adequate] sanity check. 1023*130*63 cannot be 3098 MB.
- Mark Lord's comment before probe_cmos_for_drives():
<snip>
* Of course, there is no guarantee that either drive is actually on the
* "primary" IDE interface, but we don't bother trying to sort that out
here.
* If a drive is not actually on the primary interface, then these
parameters
* will be ignored. This results in the user having to supply the logical
* drive geometry as a boot parameter for each drive not on the primary i/f.
<end snip>

Well, they aren't ignored, they are used as-is... the user doesn't get any
sort of warning like "sorry, geometry doesn't match, specify it manually".

<snip>
* The only "perfect" way to handle this would be to modify the setup.[cS]
code
* to do BIOS calls Int13h/Fn08h and Int13h/Fn48h to get all of the drive
info
* for us during initialization. I have the necessary docs -- any takers?
-ml
<end snip>

Well, I'd be happy to do it - it's a very small job, I would have thought,
especially since a cursory glance through the source suggests that the only
info which is needed anywhere is what probe_cmos_for_drives() uses.
Currently this is head, cyl, sect, and "ctl" - and here my IDE knowledge is
shown to be non-existent; how would one get the "ctl" value from a BIOS call
? If Mark or someone has the docs he mentions in the comments, I'll do it !

---------------------------------------------------------------
|\ | o _ |/ Life's like a jigsaw
| \| | |_ |\ You get the straight bits
But there's something missing in the middle

Nick Brown, Strasbourg, France (Nick(dot)Brown(at)coe(dot)fr)
---------------------------------------------------------------

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/