Hi,
2.4 seems to have problems scanning SCSI busses. It
looks rather like it is scanning the first bus for
every host that it finds.
My dmesg is attached. In my dual-P3 box, I have three
disks on the first channel of an on-board aic7xxx:
$ cat /proc/scsi/scsi
Attached devices:
Host: scsi0 Channel: 00 Id: 00 Lun: 00
Vendor: QUANTUM Model: ATLAS IV 9 WLS Rev: 0909
Type: Direct-Access ANSI SCSI revision: 03
Host: scsi0 Channel: 00 Id: 01 Lun: 00
Vendor: QUANTUM Model: ATLAS IV 9 WLS Rev: 0909
Type: Direct-Access ANSI SCSI revision: 03
Host: scsi0 Channel: 00 Id: 02 Lun: 00
Vendor: QUANTUM Model: ATLAS IV 9 WLS Rev: 0909
Type: Direct-Access ANSI SCSI revision: 03
and an old HP DAT, a CDROM and a CDRW on a Tekram 395.
The driver works just fine under 2.2, but under 2.4test8,
when I insmod it, it finds the first device (the HP DAT)
and then hangs.
SysRq-P seems to show that it's trying to register the
DAT as a disk. "dmesg" (taken beforehand, as the box is
well dead at this point) shows that once the software RAID
sets are set up, it "finds" the three disks again and adds
them as sdd, sde and sdf. /proc/scsi/scsi shows only three,
but /proc/partitions shows six.
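
The "only three" figure can be checked mechanically. This is a minimal sketch, assuming the /proc/scsi/scsi text has the layout pasted above; the function name and the sample string are illustrative only, not part of any kernel tool.

```python
import re

# Matches the "Host: scsiN Channel: CC Id: II Lun: LL" line of each entry.
HOST_RE = re.compile(
    r"Host:\s*scsi(\d+)\s+Channel:\s*(\d+)\s+Id:\s*(\d+)\s+Lun:\s*(\d+)"
)
TYPE_RE = re.compile(r"Type:\s*(\S+)")


def direct_access_devices(listing):
    """Return (host, channel, id, lun) for every Direct-Access entry."""
    devices = []
    current = None
    for line in listing.splitlines():
        host = HOST_RE.search(line)
        if host:
            current = tuple(int(g) for g in host.groups())
            continue
        dev_type = TYPE_RE.search(line)
        if dev_type and current is not None:
            if dev_type.group(1) == "Direct-Access":
                devices.append(current)
            current = None
    return devices


# Sample copied from the listing in this message.
SAMPLE = """\
Attached devices:
Host: scsi0 Channel: 00 Id: 00 Lun: 00
Vendor: QUANTUM Model: ATLAS IV 9 WLS Rev: 0909
Type: Direct-Access ANSI SCSI revision: 03
Host: scsi0 Channel: 00 Id: 01 Lun: 00
Vendor: QUANTUM Model: ATLAS IV 9 WLS Rev: 0909
Type: Direct-Access ANSI SCSI revision: 03
Host: scsi0 Channel: 00 Id: 02 Lun: 00
Vendor: QUANTUM Model: ATLAS IV 9 WLS Rev: 0909
Type: Direct-Access ANSI SCSI revision: 03
"""

disks = direct_access_devices(SAMPLE)
print(len(disks), "disks:", disks)
```

Comparing that count with the number of whole-disk entries in /proc/partitions would show the mismatch described above.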
More information is below in a correspondence that I had
with the 395 driver author.
Is this a familiar picture to anyone?
Matthew.
---------- Forwarded message ----------
Date: Wed, 13 Sep 2000 15:41:36 +0100 (BST)
From: Matthew Kirkwood <weejock@ferret.lmh.ox.ac.uk>
To: Kurt Garloff <garloff@suse.de>
Subject: Re: dc395 troubles
Hi Kurt,
Many thanks for the speedy reply.
> > On a fresh build of 2.4test8, when I insmod the
> > driver, the machine detects the card OK, finds the
> > first device (an old HP DAT drive) and then hangs.
> Arrgh! Can you do SysRq-P and find out what the machine is doing? I
> know that the driver does have bugs, but those in the worst case do
> lead to failed reconnections or similar, so commands time out ...
OK. The output is (copied by hand, since I can't find a serial
cable):
DC395x (TRM-S1040) SCSI driver 1.27, 2000-05-23
DC395x: Used settings: AdaptID=7
DC395 : Connectors ext50 int50 Termination: Auto High
DC395x (TRM-S1040): 1 adapters found
scsi2: Tekram DC395U/UW/F DC315/U v1.27 2000-05-23
scsi : 3 hosts
Vendor: HP .. bla bla etc etc
Type: Sequential bla bla etc etc
and then it's dead.
SysRq-P finds three places where the kernel is looping.
c0188a78-f1: (c0188a78 t sd_detect)
c01ac394-a1: (c01ac018 t scan_scsis_single)
and
c01e28ab-b2: which System.map says is somewhere in
c01e0c54 T stext_lock - maybe a stuck
spinlock?
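
The System.map lookup above amounts to finding the last symbol at or below an address. A minimal sketch of that lookup, assuming the usual three-column "address type name" layout; the sample holds only the three symbols quoted in this message, whereas a real System.map has thousands of entries, so offsets computed against a full file may land on a nearer symbol.

```python
import bisect


def parse_system_map(text):
    """Return a sorted list of (address, type, name) tuples."""
    symbols = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip anything that is not "address type name"
        addr, sym_type, name = parts
        symbols.append((int(addr, 16), sym_type, name))
    symbols.sort()
    return symbols


def resolve(symbols, address):
    """Return (name, offset) of the last symbol at or below address."""
    addrs = [s[0] for s in symbols]
    i = bisect.bisect_right(addrs, address) - 1
    if i < 0:
        return None  # address lies below the first known symbol
    base, _sym_type, name = symbols[i]
    return name, address - base


# Only the three entries quoted in this message; not a full System.map.
SAMPLE = """\
c0188a78 t sd_detect
c01ac018 t scan_scsis_single
c01e0c54 T stext_lock
"""

symbols = parse_system_map(SAMPLE)
for addr in (0xC0188A78, 0xC01AC394, 0xC01E28AB):
    name, offset = resolve(symbols, addr)
    print(f"{addr:08x} -> {name}+0x{offset:x}")
```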
While rebooting (again :-) I noticed something else broken,
which makes me suspect a generic kernel bug, rather than a
driver flaw. My dmesg is attached (sorry, too many RAID
sets to be able to catch the very top of it), but here is the
interesting bit:
Detected scsi disk sdd at scsi0, channel 0, id 0, lun 0
Detected scsi disk sde at scsi0, channel 0, id 1, lun 0
Detected scsi disk sdf at scsi0, channel 0, id 2, lun 0
SCSI device sdd: hdwr sector= 512 bytes. Sectors= 17942584 [8761 MB] [8.8 GB]
sdd: sdd1 sdd2 < sdd5 sdd6 sdd7 sdd8 sdd9 sdd10 sdd11 sdd12 >
SCSI device sde: hdwr sector= 512 bytes. Sectors= 17942584 [8761 MB] [8.8 GB]
sde: sde1 sde2 < sde5 sde6 sde7 sde8 sde9 sde10 sde11 sde12 >
SCSI device sdf: hdwr sector= 512 bytes. Sectors= 17942584 [8761 MB] [8.8 GB]
sdf: sdf1 sdf2 < sdf5 sdf6 sdf7 sdf8 sdf9 sdf10 sdf11 sdf12 >
For reasons known only to itself, it is detecting the disks
on the onboard aic7xxx twice. They appear only once in
/proc/scsi/scsi (0:0:0, 0:1:0, 0:2:0) but multiple times in
/proc/partitions (sda-c _and_ sdd-f).
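
One way to spot this sort of double detection is to group the "Detected scsi disk" lines by SCSI address. This is a rough sketch, not a kernel facility. The sdd-sdf lines in the sample are from the dmesg above; the first three lines are an assumption (the top of the dmesg was not captured), written here as sda through sdc.

```python
import re
from collections import defaultdict

DETECT_RE = re.compile(
    r"Detected scsi disk (\w+) at scsi(\d+), channel (\d+), id (\d+), lun (\d+)"
)


def duplicate_disks(dmesg_text):
    """Map (host, channel, id, lun) to disk names where an address repeats."""
    by_address = defaultdict(list)
    for match in DETECT_RE.finditer(dmesg_text):
        name = match.group(1)
        address = tuple(int(x) for x in match.group(2, 3, 4, 5))
        by_address[address].append(name)
    # Keep only addresses that were registered more than once.
    return {addr: names for addr, names in by_address.items() if len(names) > 1}


SAMPLE = """\
Detected scsi disk sda at scsi0, channel 0, id 0, lun 0
Detected scsi disk sdb at scsi0, channel 0, id 1, lun 0
Detected scsi disk sdc at scsi0, channel 0, id 2, lun 0
Detected scsi disk sdd at scsi0, channel 0, id 0, lun 0
Detected scsi disk sde at scsi0, channel 0, id 1, lun 0
Detected scsi disk sdf at scsi0, channel 0, id 2, lun 0
"""  # first three lines are ASSUMED; they are not in the captured dmesg

for address, names in sorted(duplicate_disks(SAMPLE).items()):
    print(address, "registered as", ", ".join(names))
```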
I guess the SCSI scanning is busted somehow. I shall see if
I can't track it down.
Matthew.