>Jason Lunz <j@falooley.org> writes:
>
>> I'm having troubles with the last few revisions of the new Gibbs aic7xxx
>> driver that until now has served me well. This is kernel
>> 2.4.13-preempt-lvmrc4 with the 6.2.4 scsi driver, but I saw it happen on
>> 2.4.12 with 6.2.1. I haven't tried older versions with this CD.
>
>Something similar here, but I have disks on this SCSI :-(
Actually not that similar.
>scsi0:A:0: parity error detected in Message-in phase. SEQADDR(0x1b6) SCSIRATE(
>0x0)
Here's your first clue. Parity errors occur on "broken" SCSI busses.
Your cabling or terminator is bad, you have a bent pin, or just too
much noise in and around your cabling.
>scsi0: Unexpected busfree in Message-out phase
>SEQADDR == 0x166
When we tried to tell the target about the parity error during message-in
phase, the target was unable to see our message without parity errors.
So, it had to go bus free to terminate the transaction. It is a bug
here that we didn't properly identify the active transaction and abort
it here due to some overly protective code, but that can be readily
fixed.
>scsi0:0:0:0: Attempting to queue an ABORT message
Because we failed to abort the command above, the mid-layer
timesout the command and we start recovery.
>scsi0: Dumping Card State while idle, at SEQADDR 0x7
>ACCUM = 0x33, SINDEX = 0x7, DINDEX = 0x8c, ARG_2 = 0x0
>HCNT = 0x0
>SCSISEQ = 0x12, SBLKCTL = 0x2
> DFCNTRL = 0x0, DFSTATUS = 0x29
>LASTPHASE = 0x1, SCSISIGI = 0x0, SXFRCTL0 = 0x80
>SSTAT0 = 0x5, SSTAT1 = 0xa
>STACK == 0x3, 0x105, 0x160, 0xe4
>SCB count = 20
>Kernel NEXTQSCB = 19
>Card NEXTQSCB = 19
>QINFIFO entries:
>Waiting Queue entries:
>Disconnected Queue entries: 7:5
>QOUTFIFO entries:
>Sequencer Free SCB List: 8 11 4 5 10 15 2 3 1 0 9 13 14 6 12
>Pending list: 5
>Kernel Free SCB list: 7 12 0 4 10 6 9 14 3 1 8 11 15 13 2 18 17 16
>DevQ(0:0:0): 0 waiting
>DevQ(0:1:0): 0 waiting
>(scsi0:A:0:0): Queuing a recovery SCB
>scsi0:0:0:0: Device is disconnected, re-queuing SCB
>Recovery code sleeping
>(scsi0:A:0:0): Abort Tag Message Sent
>(scsi0:A:0:0): SCB 5 - Abort Tag Completed.
>Recovery SCB completes
>Recovery code awake
>aic7xxx_abort returns 0x2002
And we abort the transaction successfully.
>scsi0:A:0: parity error detected in Message-in phase. SEQADDR(0x1b6) SCSIRATE(
>0x0)
>Kernel panic: HOST_MSG_LOOP with invalid SCB ff
We get another parity error in a state without a proper connection (probably
on the identify message after a reselection) and some more defensive
programming prevents us from handling it correctly. 8-(
The end result is that you need to fix your SCSI bus to be more reliable.
I will fix the issues you've discovered in parity error and unexpected
busfree handling in the next driver release. If your bus is working
properly, you shouldn't see these errors.
-- Justin - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
This archive was generated by hypermail 2b29 : Wed Nov 07 2001 - 21:00:19 EST